03/19/2020 (Thu) 04:03:44
True taggable collections/galleries that are batched together in memory, and the ability to index a gallery/collection based on a specific (tagspace? I forget the term). For instance, a collection of manga pages in a folder, sorted by page number tags, in a volume collection/gallery. Then volume collections/galleries can be grouped into a set of series collections/galleries. And each gallery/collection group could have its own tags, in addition to specific tags on the contents. You could also display a gallery as a folder/4-frame contents pane to reduce the number of thumbnails of like content that need to be displayed. This would also ease loading and caching if you could set a sorting/indexing tag for the collection so that its files are stored contiguously in memory, rather than being random access, as most paged content will typically be accessed in series/sorted order. A "deep search" could also search for tag matches in the gallery contents, but you would acknowledge that such a search would take more time. At the moment this is something that file systems actually do better natively than Hydrus, but I love the Hydrus system for tagging and storing random disparate content, and even disparate franchises/paged content. Admittedly, one of the concerns with such a system as it becomes quite large would be hash lookups at all the different branches and levels of galleries, but I think as long as the shallow lookup is fast, and you pull in the media with the proper tags to look it up in the sorted gallery index or hash, it would still be relatively fast while providing a massive performance boost for users with a large amount of paged content. As it is, it's hard for me to even clearly see where one manga ends and another starts, even when displaying the titles on the thumbnails. I would really appreciate this improvement for paged content and collections.
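To make the nesting idea concrete, here is a rough Python sketch of what I mean. Everything in it is made up for illustration (the MediaFile and Collection classes, the sort_namespace field, the shallow_match and deep_search methods); none of it is Hydrus's actual data model or API. It only shows collections carrying their own tags, a per-collection sort tag, and a slower recursive "deep search" into the contents.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class MediaFile:
    """Hypothetical stand-in for one stored file and its tags."""
    file_hash: str
    tags: Set[str] = field(default_factory=set)


@dataclass
class Collection:
    """Hypothetical nested gallery: has its own tags plus child files/collections."""
    name: str
    tags: Set[str] = field(default_factory=set)
    sort_namespace: str = "page"  # assumed: the tag prefix used to order files
    children: List[Union["Collection", MediaFile]] = field(default_factory=list)

    def sorted_files(self) -> List[MediaFile]:
        """Direct file children ordered by their numeric sort tag, e.g. 'page:12'."""
        prefix = self.sort_namespace + ":"

        def sort_key(media: MediaFile) -> float:
            for tag in media.tags:
                value = tag[len(prefix):]
                if tag.startswith(prefix) and value.isdigit():
                    return int(value)
            return float("inf")  # files without a sort tag go last

        files = [c for c in self.children if isinstance(c, MediaFile)]
        return sorted(files, key=sort_key)

    def shallow_match(self, tag: str) -> bool:
        """Fast check: only looks at the collection's own tags."""
        return tag in self.tags

    def deep_search(self, tag: str) -> List[MediaFile]:
        """Slow check: walks every nested collection and file."""
        hits: List[MediaFile] = []
        for child in self.children:
            if isinstance(child, MediaFile):
                if tag in child.tags:
                    hits.append(child)
            else:
                hits.extend(child.deep_search(tag))
        return hits


# Example: a series containing one volume containing two pages.
volume = Collection("Volume 1", tags={"volume:1"}, children=[
    MediaFile("hash_b", {"page:2", "character:alice"}),
    MediaFile("hash_a", {"page:1"}),
])
series = Collection("Some Series", tags={"series:some series"}, children=[volume])
```

The shallow match only touches the series' own tags, while the deep search has to visit every page, which is the time tradeoff I mentioned.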
Hopefully I've written this to be understandable. I realize this would be a large change to how Hydrus does its data storage and lookups, so maybe just start with implementing folders/galleries/collections that then just hash their own contents. That would essentially just be a nesting of the existing data structure with recursive searches and lookups in the child structures. Of course, if you create a lot of these you lose the benefits of O(1) lookup, so maybe it isn't the greatest solution, as in the worst case I think this could degrade to linear time. But perhaps each subdirectory could get a file containing all the hashes added under it? Then we would just have to look at that file rather than going through all the children in the directory... but then that leaves us with the problem of how to organize that file, which we will probably want to do with a hash table of some sort that we serialize to a binary format, starting with a fixed capacity and using a good primes array to expand the binary file as needed. Chaining is also harder to do in a binary file due to how it is stored, so maybe closed hashing (open addressing) could make sense? Or we just eat the overhead of implementing a tree scheme in the binary file that points to indices in the file using a linked-list-style architecture, though memory use would be considerably larger, I think. Thus a better solution might very well be to just resize a closed hashing table immediately every time we reach 1/2 capacity. Thoughts? How does Hydrus currently handle its internal hash table?
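Here is a minimal Python sketch of the closed hashing idea, just so it's clear what I'm picturing. It is an in-memory toy with linear probing that grows as soon as it is half full; the ClosedHashTable class, the PRIMES list, and the idea of mapping a file hash to an integer offset are all my own assumptions, not anything Hydrus does. Writing the slots out to a binary file and handling deletes (which would need tombstones) are left out.

```python
import hashlib
from typing import List, Optional, Tuple

# Assumed capacity schedule: each prime is roughly double the previous one.
PRIMES = [11, 23, 47, 97, 197, 397, 797, 1597, 3203, 6421]

Slot = Optional[Tuple[bytes, int]]


class ClosedHashTable:
    """Toy open-addressing table: file hash (bytes) -> integer value."""

    def __init__(self) -> None:
        self._prime_index = 0
        self._slots: List[Slot] = [None] * PRIMES[0]
        self._count = 0

    def __len__(self) -> int:
        return self._count

    @property
    def capacity(self) -> int:
        return len(self._slots)

    @staticmethod
    def _probe(slots: List[Slot], key: bytes) -> int:
        """Linear probing: return the slot holding key, or the first empty slot."""
        capacity = len(slots)
        i = int.from_bytes(key[:8], "big") % capacity
        while slots[i] is not None and slots[i][0] != key:
            i = (i + 1) % capacity
        return i

    def _grow(self) -> None:
        """Move to the next prime capacity and reinsert every entry."""
        self._prime_index += 1
        if self._prime_index < len(PRIMES):
            new_capacity = PRIMES[self._prime_index]
        else:
            new_capacity = 2 * len(self._slots) + 1  # fallback past the list
        new_slots: List[Slot] = [None] * new_capacity
        for entry in self._slots:
            if entry is not None:
                new_slots[self._probe(new_slots, entry[0])] = entry
        self._slots = new_slots

    def put(self, key: bytes, value: int) -> None:
        i = self._probe(self._slots, key)
        if self._slots[i] is None:
            self._count += 1
        self._slots[i] = (key, value)
        # Resize immediately once the table is at least half full.
        if self._count * 2 >= len(self._slots):
            self._grow()

    def get(self, key: bytes) -> Optional[int]:
        entry = self._slots[self._probe(self._slots, key)]
        return None if entry is None else entry[1]


# Example: store 100 SHA-256 digests and look one up.
table = ClosedHashTable()
for n in range(100):
    table.put(hashlib.sha256(str(n).encode()).digest(), n)
```

Because the table never gets more than half full, a probe always finds an empty slot, so lookups stay short at the cost of wasting at least half the slots.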
Also, can you implement a fuzzy tag search that matches substrings? It gets really annoying having to include the namespace prefix when I want to look for something like "memes".
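Something like this rough Python sketch is what I have in mind. The search_tags function is hypothetical, not an existing Hydrus call, and it is a plain case-insensitive substring match rather than true fuzzy matching. It assumes tags look like "namespace:subtag" and only compares against the part after the first colon.

```python
from typing import Iterable, List


def search_tags(query: str, tags: Iterable[str]) -> List[str]:
    """Return tags whose subtag contains the query, ignoring the namespace."""
    needle = query.casefold()
    matches = []
    for tag in tags:
        _namespace, separator, subtag = tag.partition(":")
        # Tags without a colon have no namespace, so match the whole tag.
        haystack = subtag if separator else tag
        if needle in haystack.casefold():
            matches.append(tag)
    return matches


known_tags = ["meta:memes", "memes", "series:Meme Review", "character:alice"]
results = search_tags("meme", known_tags)
```

This is a linear scan over every tag, so on a big database it would presumably need some kind of index behind it.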