Why BLOB-field is under-field, and how to make it field
Multimedia data today is rapidly growing - pictures, music, movies. Modern tendency - to store them in database, instead of in file system. It is plainly inconvenient to save science graphic data outside database, because they are accompanied by great quantity of additional numerical data. Two most typical operations with BLOB are:
- downloading (from server, including from DBMS)
- accepting (from client, which submit it by own initiative)
Let's list, what task they are create, and let's propose, how to solve them.
Downloading
Connections are broken sometimes. Besides this, we might need to reboot computer-receiver, or to switch off notebook-receiver. So it is necessary to save in field:
- external address of BLOB to continue downloading
- already downloaded part of BLOB - and even parts, if DBMS can divide remote BLOB into several parts (like usual programs for downloading, e.g. "ReGet"), and download each part via separate TCP-connection
- offsets of beginning and of end for each part to continue downloading and assembling into one BLOB
And we need operators:
- returning BLOB
- returning percents of completeness of downloading
- returning address of BLOB on remote server (instead of NULL to inform anybody, asking for this BLOB, that downloading is not finished yet, and where it is possible to obtain BLOB before finishing downloading, or if DBMS will be destroyed)
- starting downloading in background mode (i.e. allowing next SQL-operators to be executed during downloading), and returning, downloading is started or not
Accepting
We cannot continue accepting after interrupt of TCP-connection, after rebooting or switching off - we can only rely, that connection will not break, computer will not freeze, and that UPS will hold to the end of transferring. But if anybody will ask DBMS to give this BLOB during accepting, what kind of feedback DBMS should provide? Answering NULL, it notes, that it knows nothing about BLOB, which is false, because accepting BLOB is only a matter of time. So, that field must save:
- already accepted part of BLOB
- meaning "ACCEPTING", similar to NULL, but not equal to it
Let third database copy BLOB-field into itself, while our database still accepting it. May third database contain "ACCEPTING"? Obviously no, because it will looks like third database itself is accepting BLOB. So third database would misinform own users, and will inspire false hopes to them. Thus we need once more meaning, similar to NULL, e.g.
into which "ACCEPTING" is turned during copying. NOTGOT is turned into NOTGOT during copying.
Both cases
It is necessary to note, that BLOB is often picture, which should be displayed in client (e.g. in browser). And it would be very inconveniently, if address would contain name of BLOB-field (to distinguish it from other BLOB-fields), field name of primary key (to specify record), and table name
[1]. And how to write this address? It will be quite not similar to URL, besides client must not know SQL. So we see, that
- identifier, unique in whole database
must be placed into BLOB-field at once, when BLOB is accepted or downloaded wholly. Placing should be performed by DBMS automatically, without participation of programmer
[2].
In general, BLOB - is a file, and as each file, it has type - jpg, mpg, etc. Files of some formats contain file type specification in just beginning of file, files of other formats - do not contain, and in this case, file type must be stored in BLOB-field, and be copied from field to field at copying BLOB
[3].
Resume
Let's summarize: BLOB-field must be capable to contain one of the following
sets of meanings:
or
- meaning ACCEPTING
- file type
- one part (beginning) of BLOB
or
or
- identifier, unique in whole database
- file type
- whole BLOB
or
- URL [4]
- file type
- several parts of BLOB
- offsets of beginning and of end for each part of BLOB
Operators, mentioned above, will looks so:
- "SELECT fieldname ..." - to extract BLOB
[5]
- "SELECT ID(fieldname) ..." - to extract BLOB idenfifier
- "... WHERE fieldname=542" - to specify record by BLOB identifier, equal 542
- "SELECT PERCENT(fieldname) ..." - to extract percents of completeness of downloading
- "SELECT TYPE(fieldname) ..." - to extract type of BLOB
[5]
- "SELECT DOWNLOAD(fieldname) ..." - to start download in background mode and to inform, downloading is started or not (i.e. returns Y/N)
- "SELECT STOP(fieldname) FROM * WHERE fieldname=542" - to stop downloading BLOB with identifier, equal 542
[1] Supposing, that name of scheme is united together with table name into one token
[2] It would be very foolish, if programmer would must to create sequence for BLOBs, and to create trigger per each BLOB-field in database
[3] It would be good, if DBMS would understand great quantity of formats, and would correct file type in BLOB-field to meaning from file content
[4] Like "site.com/~dbowner/dbname/identifier" or "111.222.333.444/~dbowner/dbname/identifier"
[5] It is necessary to know BLOB type only to select BLOBs for converting from one format to another.
BLOB type must be extracted automatically together with BLOB at transferring BLOB,
so it must be unnecessary to request BLOB type separately.
Besides this, BLOBs should be transferred not in a record, but after all records.
Otherwise they will block reaction of acceptor at fast transferred non-BLOB fields.
<tablename fieldname1= fieldname="3652435"/>
<?file value="3652435" type="mpg" size="3">Y29</?file>
P.S.
Article is guide to implement ideas of
p.167-175 of
pdf-document.
Besides this, to create peer-to-peer network like BitTorrent, Kademlia and so on by force,
commands can be appended to enquiry quantity of downloading parts and their beginnings and ends
select reckon( @field ) from ... ; -- returns quantity of blob parts
select beginning( @field, 5) from ... ; -- beginnings of 5th part
select end( @field, 5) from ... ; -- ends of 5th part
Dima Turin, dmitryturin@yandex.ru
Используются технологии
uCoz