Storing a lot of items
The problem
If you have a very large number of items to store (hundreds of thousands
or more), such as experimental sentences, tweets, audio fragments, or
individual calculations, it might be tempting to store each one as an
individual file so that the file system more or less reflects your data
structure. This has an important downside, however: the file system
becomes very slow. A simple command like ls is no longer instant, loading
your data with a script can take a very long time, and, most importantly,
it becomes practically impossible to move your files around, for example
when they need to be moved to another disk for maintenance.
The solution
There are multiple solutions:
- Store everything in larger files, where each line is one item.
- Use an SQLite or MySQL database. MySQL databases can be requested via the admin.
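As a rough illustration of both options, the sketch below stores the same items once as a single file with one JSON document per line, and once in a single SQLite database file. It uses only the Python standard library. The item layout (an `id` and a `text` field), the table name `items`, and all function and file names are assumptions made for this example, not part of any existing setup.

```python
import json
import os
import sqlite3
import tempfile


def write_jsonl(items, path):
    """Write all items to one file, one JSON document per line."""
    with open(path, "w", encoding="utf-8") as f:
        for item in items:
            # json.dumps escapes newlines inside strings, so each item stays on one line.
            f.write(json.dumps(item, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Yield items one at a time, so the whole file never has to fit in memory."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield json.loads(line)


def write_sqlite(items, path):
    """Store items in a single SQLite file (hypothetical schema: id, text)."""
    con = sqlite3.connect(path)
    try:
        with con:  # commits the transaction on success, rolls back on error
            con.execute(
                "CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, text TEXT)"
            )
            con.executemany(
                "INSERT OR REPLACE INTO items (id, text) VALUES (?, ?)",
                ((item["id"], item["text"]) for item in items),
            )
    finally:
        con.close()


def read_sqlite_item(path, item_id):
    """Look up one item by id without scanning the rest of the data."""
    con = sqlite3.connect(path)
    try:
        row = con.execute(
            "SELECT text FROM items WHERE id = ?", (item_id,)
        ).fetchone()
        return row[0] if row else None
    finally:
        con.close()


if __name__ == "__main__":
    # Made-up example data, written to a temporary directory.
    items = [{"id": i, "text": f"sentence {i}"} for i in range(100_000)]
    with tempfile.TemporaryDirectory() as tmp:
        jsonl_path = os.path.join(tmp, "items.jsonl")
        db_path = os.path.join(tmp, "items.db")
        write_jsonl(items, jsonl_path)
        write_sqlite(items, db_path)
        print(sum(1 for _ in read_jsonl(jsonl_path)), "items in one text file")
        print("item 42:", read_sqlite_item(db_path, 42))
```

The line-per-item file is the simpler choice when a script always reads everything in order. A database is likely the better fit when individual items need to be looked up or updated. Either way, the file system only has to handle one file instead of hundreds of thousands.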