/hydrus/ - Version 448

Version 448 Anonymous Board owner 07/28/2021 (Wed) 22:02:45 Id: ae2aa4 No.1108

https://youtube.com/watch?v=VEwVAV3VPw4
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v448/Hydrus.Network.448.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v448/Hydrus.Network.448.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v448/Hydrus.Network.448.-.macOS.-.App.dmg
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v448/Hydrus.Network.448.-.Linux.-.Executable.tar.gz

Hey, I am sorry, Endchan has been down or has not allowed login/posting for the past couple of weeks when I needed to get the releases out, so there was a gap.

I had an ok couple of weeks. I was pretty ill in the middle, but I got some good work done overall. .wav files are now supported, PSD files get thumbnails, vacuum returns, and the Client API allows much cleverer search.

client api

I have added some features to the Client API. It was more complicated than I expected, so I couldn't get everything I wanted done, but I think this is a decent step forward.

First off, the main 'search for files' routine now supports many system predicates. This is thanks to a user who wrote a great system predicate text parser a long time ago. I regret I am only catching up with his work now, since it works great. I expect to roll it into normal autocomplete typing as well--letting you type 'system:width<500' and actually getting the full predicate object in the results list to select.

If you are working with the Client API, please check out the extended help here:

https://hydrusnetwork.github.io/hydrus/help/client_api.html#get_files_search_files

There's a giant list of the current supported inputs. You'll just be submitting system predicates as text, and it handles the rest. Please note that this is a complicated system, and while I have plenty of unit tests and so on, if you discover predicates that should parse but are giving errors or any other jank behaviour, please let me know!
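As a rough sketch of what "submitting system predicates as text" can look like from a script: the snippet below only builds the request and does not send it. The address, the placeholder access key, and the example tags are assumptions for illustration; check the linked help for the exact predicate forms that currently parse.

```python
import json
import urllib.parse
import urllib.request

API_URL = "http://127.0.0.1:45869"  # assumed local Client API address
ACCESS_KEY = "replace-with-your-access-key"  # placeholder, not a real key


def build_search_request(tags, api_url=API_URL, access_key=ACCESS_KEY):
    """Build (but do not send) a GET request for /get_files/search_files."""
    # Tags and system predicates go in together as one JSON list,
    # URL-encoded into the 'tags' query parameter.
    query = urllib.parse.urlencode({"tags": json.dumps(tags)})
    url = f"{api_url}/get_files/search_files?{query}"
    return urllib.request.Request(
        url, headers={"Hydrus-Client-API-Access-Key": access_key}
    )


request = build_search_request(["blue eyes", "system:width<500", "system:inbox"])
print(request.full_url)
```

Passing the request to urllib.request.urlopen would send it; a predicate that does not parse should come back as a 400 response.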

Next step here is to add file sort and file/tag domain.

Next there's a routine that lets you add files to arbitrary pages, just like a thumbnail drag and drop:

https://hydrusnetwork.github.io/hydrus/help/client_api.html#manage_pages_add_files

This is limited to currently open pages for now, but I will add a command to create an empty file page so you can implement an external file importer page.
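A minimal sketch of a call to the new routine, again only building the request. The page_key and file_ids parameter names follow the linked help as best understood here, so treat them as assumptions and confirm them there. The address, access key, and example values are placeholders.

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:45869"  # assumed local Client API address
ACCESS_KEY = "replace-with-your-access-key"  # placeholder, not a real key


def build_add_files_request(page_key, file_ids, api_url=API_URL, access_key=ACCESS_KEY):
    """Build (but do not send) a POST request for /manage_pages/add_files."""
    file_ids = list(file_ids)
    # Check locally that file_ids is a list of integers before building the body.
    if not all(isinstance(file_id, int) for file_id in file_ids):
        raise TypeError("file_ids should be a list of integers")
    body = json.dumps({"page_key": page_key, "file_ids": file_ids}).encode("utf-8")
    return urllib.request.Request(
        f"{api_url}/manage_pages/add_files",
        data=body,
        method="POST",
        headers={
            "Hydrus-Client-API-Access-Key": access_key,
            "Content-Type": "application/json",
        },
    )


# 'example-page-key' is a made-up value; a real key comes from an open page.
request = build_add_files_request("example-page-key", [123, 124, 125])
print(request.get_method(), request.full_url)
```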

misc

.wav files are now supported! They should work fine in mpv as well.

Simple PSD files now get thumbnails! It turns out FFMPEG can figure this out as long as the PSD isn't too complicated, so I've done it like for .swf files--if it works, the PSD gets a nice thumbnail, and if it doesn't it gets the file default icon stretched to the correct ratio. When you update, all existing PSDs will be queued for a thumbnail regen, so they should sort themselves out in the background.

Thanks to profiles users sent in, I optimised some database code. Repository processing and large file deletes should be a little faster. I had a good look at some slow session save profiles--having hundreds of thousands of URLs in downloader pages currently eats a ton of CPU during session autosave--but the solution will require two rounds of significant work.

Database vacuum returns as a manual job. I disabled this a month or so ago as it was always a rude sledgehammer that never actually achieved all that much. Now there is some UI under database->db maintenance->review vacuum data that shows each database file separately with their current free space (i.e. what a vacuum will recover), whether it looks like you have enough space to vacuum, an estimate of vacuum time, and then the option to vacuum on a per file basis. If you recently deleted the PTR, please check it out, as you may be able to recover a whole ton of disk space.
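For anyone curious how the "current free space" figure can be derived, here is a small sketch using plain SQLite pragmas. It is not the client's own code: the function name is made up, and the "twice the file size" disk check is an assumption based on SQLite's general guidance for VACUUM.

```python
import os
import shutil
import sqlite3
import tempfile


def review_vacuum(db_path):
    """Report how much space a VACUUM of this SQLite file could recover."""
    conn = sqlite3.connect(db_path)
    try:
        page_size = conn.execute("PRAGMA page_size").fetchone()[0]
        page_count = conn.execute("PRAGMA page_count").fetchone()[0]
        free_pages = conn.execute("PRAGMA freelist_count").fetchone()[0]
    finally:
        conn.close()
    db_bytes = page_size * page_count
    disk_free = shutil.disk_usage(os.path.dirname(os.path.abspath(db_path))).free
    return {
        "db_bytes": db_bytes,
        # Pages on the freelist are what a vacuum gives back to the disk.
        "recoverable_bytes": page_size * free_pages,
        # Assumed rule of thumb: want roughly 2x the file size free.
        "enough_space": disk_free > 2 * db_bytes,
    }


def demo():
    """Create a throwaway database, delete its rows, and vacuum it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # isolation_level=None means autocommit, so VACUUM can run later.
        conn = sqlite3.connect(path, isolation_level=None)
        conn.execute("CREATE TABLE t (payload BLOB)")
        conn.executemany("INSERT INTO t VALUES (?)", [(b"x" * 4096,)] * 200)
        conn.execute("DELETE FROM t")
        conn.close()
        before = review_vacuum(path)
        conn = sqlite3.connect(path, isolation_level=None)
        conn.execute("VACUUM")
        conn.close()
        after = review_vacuum(path)
    return before, after


before, after = demo()
print(before["recoverable_bytes"], after["recoverable_bytes"])
```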

I fixed Mr Bones! I knew I'd make a typo somewhere with the file service rewrite two weeks ago, and he got it. I hadn't realised how popular he was, so I've added him to my weekly test suite--it shouldn't happen again.

Anonymous Board owner 07/28/2021 (Wed) 22:03:22 Id: ae2aa4 No.1109

full list

- client api:
- /get_files/search_files now supports most system predicates! simply submit normal system predicate text in your taglist (check the expanded api help for a list of what is supported now) and they should be converted to proper system preds automatically. anything that doesn't parse will give a 400 response. this is thanks to a user that submitted a system predicate parser a long time ago and which I did not catch up on until now. with this framework established, in future I will be able to add more predicate types and allow this parsing in normal autocomplete typing (issue #351)
- this is a complicated system with many possible inputs and outputs! I have tried to convert all the object types over and fill out unit tests, but there are likely some typos or bad error handling for some unusual predicates. let me know what problems you run into, and I'll fix it up!
- the old system_inbox and system_archive parameters on /get_files/search_files are now obsolete. they still work, but I recommend you just use tags now. I'll deprecate them fully in future
- /get_files/search_files now disables the implicit system limit that most clients apply to all searches (by default, 10,000), so if you ask for a million files, you'll (eventually) get it
- a new call /manage_pages/add_files now allows you to add files to any media page, just like a file drag and drop
- in the /get_files/file_metadata call, the tag lists in the different 'statuses' Objects are now human-sorted
- added a link to https://github.com/floogulinc/hyextract to the client api help. this lets you extract from imported archives and reimport with tags and URLs
- the client api is now ok if you POST with a utf-8 charset content-type "application/json;charset=utf-8"
- the client api now tests the types of items within list parameters (e.g. file_ids should be a list of _integers_), raising an appropriate exception if they are incorrect
- client api version is now 18
- .
- misc:
- hydrus now supports wave (.wav) audio files! they play in mpv fine too
- simple psd files now have thumbnails! complicated ones will get a stretched version of the old default psd filetype thumbnail, much like how flash works. all your psd files are queued up for thumbnail regen on update, so they should figure themselves out in the background. this is thanks to ffmpeg, which it turns out can handle simple psds!
- vacuum returns as a manual operation. there's some new gui under _database->db maintenance->review vacuum data_. it talks about vacuum, shows current free space for each file, gives an estimate of how long vacuum will take, and allows you to launch vacuum on particular files
- the 'maintenance and processing' option that checks CPU usage for 'system busy' status now lets you choose how many CPU cores must exceed the % value (previously, one core exceeding the value would cause 'busy'). maybe 4 > 25% is more useful than 1 > 50% in some situations?
- removed the warning when updating from v411-v436. user reports and more study suggest this range was most likely ok in the end!
- double-clicking the autocomplete tag list, or the current/pending/etc.. buttons, should now restore keyboard focus back to the text input afterwards, in float mode or not
- the thumbnail 'remote services' menu, if you have file repositories or ipfs services, now appears on the top level, just below 'manage'
- the file maintenance menu is shuffled up the 'database' menubar menu
- fixed mr bones! I knew I was going to make a file status typo in 447, and he got it
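The new 'system busy' rule in the list above can be sketched as a simple count of cores over a threshold. This is an illustration only: the function name and the sample readings are made up, and the client's actual check may differ in detail.

```python
def system_is_busy(per_core_percent, threshold_percent, cores_required):
    """Return True if at least cores_required cores exceed threshold_percent."""
    busy_cores = sum(1 for usage in per_core_percent if usage > threshold_percent)
    return busy_cores >= cores_required


# One core pinned by some single-threaded job, the rest mostly idle.
readings = [80, 10, 10, 10]

print(system_is_busy(readings, 50, 1))  # True: the old-style '1 > 50%' rule trips
print(system_is_busy(readings, 25, 4))  # False: '4 > 25%' ignores one hot core
```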

Anonymous Board owner 07/28/2021 (Wed) 22:04:41 Id: ae2aa4 No.1110

- in the downloader system, if a download object has any hashes, it now no longer consults urls for pre-import predictions. this saves a little time looking up urls and ensures that the logically stronger hashes take precedence over urls in all cases (previously, they only took precedence when a non-'looks new' status was found)
- fixed an ugly bug in manage tag siblings/parents where tags imported from clipboard or .txt were not being cleaned, so all sorts of garbage with capital letters or leading spaces could be entered. all pairs are now cleaned, and anything invalid skipped over
- the manage tag filter dialog now cleans all imported tag rules when using the 'import' button (issue #768)
- the manage tag filter dialog now allows you to export the current tag filter with the export button
- fixed the 'edit json parse rule' dialog layout so if you transition from a short display to a string match that has complicated controls, it should now expand properly to show them all
- I think I fixed an odd bug where when uploading pending mappings while more mappings were being added, the x/y progress could accurately but unhelpfully continually reset to 0/y, with an ever-decreasing y until it was equal to the value it had at start. y should now always grow
- hydrus servers now put their server header on a second header 'Hydrus-Server', which should allow them to be properly detectable through a proxy that overrides 'Server'
- optimised a critical call in the tag mappings update database routine. for a service with many siblings and parents, I estimate repository processing is 2-7% faster
- optimised the 'add/delete file' database routines in multiple ways, particularly when the file(s) have many deleted tags, and for the local file services, and when the client has multiple tag services
- brushed up a couple of system predicate texts--things like num_pixels to 'number of pixels'
- .
- boring database refactoring:
- repository update file tracking and service id normalisation are now pulled out to a new 'repositories' database module
- file maintenance tracking and database-level file info updates are now pulled out to a new 'files maintenance' database module
- analyse and vacuum tracking and information generation are now pulled out to a new 'db maintenance' database module
- moved more commands to the 'similar files' module
- the 'metadata regeneration' file maintenance job is now a little faster to save back to the database
- cleared out some defunct/bad database code related to these two modules
- misc code cleanup, particularly around the stuff I optimised this week
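To illustrate the sibling/parent import fix in the list above, here is a toy version of "clean every pair, skip anything invalid". The rules used here (lowercase, trim and collapse whitespace, drop empty or self-referencing pairs) are stand-in assumptions, and the client's real tag cleaning does more than this.

```python
def clean_tag(raw):
    """Lowercase a tag and collapse its whitespace; return None if nothing is left."""
    tag = " ".join(raw.lower().split())
    return tag or None


def clean_pairs(raw_pairs):
    """Clean each (a, b) pair and skip pairs that end up invalid."""
    cleaned = []
    for raw_a, raw_b in raw_pairs:
        a, b = clean_tag(raw_a), clean_tag(raw_b)
        # Skip pairs with an empty side or that point a tag at itself.
        if a is None or b is None or a == b:
            continue
        cleaned.append((a, b))
    return cleaned


pasted = [("  Blue Eyes", "blue_eyes "), ("", "orphan"), ("Cat", "cat")]
print(clean_pairs(pasted))
```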

next week

Next week is a 'medium job' week. To clear out some long time legacy issues, I want to figure out an efficient way to reset and re-do repository processing just for siblings and parents. If that goes well, I'll put some more time into the Client API.

Experimental Release Tomorrow! Anonymous Board owner 08/04/2021 (Wed) 04:02:06 Id: cc1d47 No.1113

I had a mixed week. I completed a long delayed maintenance routine for repositories, letting them track tags, siblings, and parents processing separately, but it proved much more complicated than I expected, and while I am happy with the work, I have nothing else to show. Since the change also touches core areas of repository processing, I want to do a limited beta test before I roll it out to everyone.

The release should be normal time tomorrow, but it will be an experimental release, only recommended for advanced users.

Version 449 (Experimental) Anonymous Board owner 08/04/2021 (Wed) 22:31:22 Id: e0032d No.1114

https://youtube.com/watch?v=boFZ3cAws20
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v449/Hydrus.Network.449.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v449/Hydrus.Network.449.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v449/Hydrus.Network.449.-.macOS.-.App.dmg
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v449/Hydrus.Network.449.-.Linux.-.Executable.tar.gz

I had a mixed week. I was able to get a long-planned maintenance routine completed, but that's all I have to show. This is an experimental release, only for advanced users who want to help me test it.

repository processing tracking

Since I haven't got anything really fun this week, and this changes something delicate, I only want advanced users to check it out for now. If you have experience with the program, run a regular backup, sync with the PTR or another repository, and want to help me out, then please update this week and use your repository normally. Let me know if you run into any trouble. One thing I noticed just now is my IRL client didn't want to catch up to some final processing until I restarted it.

Note this update will delete your pending siblings and parents, so commit before you update! I'll make it so it doesn't do this next week.

So, repositories now track their 'processing' status more cleverly. Hit services->review services to see--now the 'definitions' part of an update is separated, and the different contents a repository has (just files for a file repo, but mappings, siblings, and parents for a tag repo) also have separate tracking and pause buttons. Most of the time you'll see everything at the same progress, but now the client can do independent reset, so 'clear all siblings and then reprocess them', and it doesn't have to nuke and work through all your mappings as well.
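The idea of per-content tracking can be sketched with a small data structure. This is purely illustrative: the class and field names are invented, and the real tracking lives in database tables rather than in objects like these.

```python
from dataclasses import dataclass, field


@dataclass
class ContentProgress:
    processed: int = 0   # how many updates this content type has worked through
    paused: bool = False


@dataclass
class RepositoryProgress:
    total_updates: int
    definitions: ContentProgress = field(default_factory=ContentProgress)
    content: dict = field(default_factory=dict)  # content type name -> ContentProgress

    def reset(self, content_type):
        """Forget progress for one content type only, leaving the others alone."""
        self.content[content_type].processed = 0

    def is_caught_up(self):
        """Only unpaused content types count towards 'caught up'."""
        tracked = [self.definitions] + [
            c for c in self.content.values() if not c.paused
        ]
        return all(c.processed >= self.total_updates for c in tracked)


def make_tag_repo(total_updates, processed=0):
    """A tag repository tracks mappings, siblings, and parents separately."""
    return RepositoryProgress(
        total_updates=total_updates,
        definitions=ContentProgress(processed),
        content={
            name: ContentProgress(processed)
            for name in ("mappings", "siblings", "parents")
        },
    )


repo = make_tag_repo(total_updates=100, processed=100)
repo.reset("siblings")  # 'clear all siblings and then reprocess them'
print(repo.content["mappings"].processed, repo.content["siblings"].processed)
print(repo.is_caught_up())
```

Resetting siblings leaves the mappings counter untouched, which is the whole point of the split: a cheap reprocess of one content type no longer forces the expensive one.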

This sounds simple, but it turns out it touches a bunch of core systems, and many were old and dusty. I brushed everything up, maybe fixed some little bugs or lags along the way, and added some neat reprocess commands to the 'review services' panel. All siblings and parents will be reset this week--part of a long-time problem with non-deterministic sibling/parent processing I have been trying to pin down with the PTR janitors--but doing this reset now just takes a couple of seconds and shouldn't take more than a minute to reprocess.

There are some secondary cool things here--users can potentially now sync with the PTR just for the siblings and parents. It is still a little inefficient, since you are getting the tens of millions of definitions no matter what, but you can skip the 1.3 billion mappings if you want. I also feel more able to hang new tools off it like a tag filter (e.g. 'get all the creator tags from PTR, but nothing else') in future.

full list

- this is an experimental release! please do not use this unless you are an advanced user who has a good backup, syncs with a repository (e.g. the PTR), and would like to help me out. if this is you, I don't need you to do anything too special, just please use the client and repo as normal, downloading and uploading, and let me know if anything messes up during normal operation
- repository processing split:
- tl;dr: nothing has changed, you don't have to do anything. with luck, your PTR service is going to fix some bad siblings and parents over the next couple of days
- repositories now track what they have processed on a per-content basis. this gives some extra maintenance tools to, for instance, quickly reset and reprocess your ~150k tag siblings on the PTR without having to clear and reprocess all 1.3 billion mappings too

Anonymous Board owner 08/04/2021 (Wed) 22:31:47 Id: e0032d No.1115

- in review services, you now see definition updates and all a repository's content types processing progress independently (just files for a file repo, but mappings, siblings, and parents for a tag repo). most of the time they will all be the same, but each can be paused separately. it is now possible (though not yet super efficient, since definitions still run 100%) to sync with the PTR and only grab siblings and parents by simply pausing mappings in review services
- I have also split the 'network' and 'processing' sync progress gauges and their buttons into separate boxes for clarity
- the 'fill in content gaps' maintenance job now lets you choose which content types to do it for
- also, a new 'reset content' maintenance job lets you choose to delete and reprocess by content type. the nuclear 'complete' reset is now really just for critical situations where definitions or database tables are irrevocably messed up
- all users have their siblings and parents processing reset this week. next time you have update processing, they'll come back in over about fifteen minutes, and with luck we'll wipe out some years-old legacy bugs and hopefully discover some info about the remaining bugs. most importantly, we can re-trigger this reprocess in just a few seconds to quickly test future fixes
- a variety of tests such as 'service is mostly caught up' are now careful just to test for the currently unpaused content types
- if you try to commit some content that is currently processing-paused, the client now says 'hey, sorry this is paused, I won't upload that stuff right now' but still uploads everything else that isn't paused. this is a 'service is caught up' issue
- tag display sync, which does background work to make sure siblings and parents appear as they should, will now not run for a service if any of the services it relies on for siblings or parents is not process synced. when this happens, it is also shown on the tag display sync review panel. this stops big changes like the new sibling/parent reset from causing display sync to do a whole bunch of work before the service is ready and happy with what it has. with luck it will also smooth out new users' first PTR sync too
- clients now process the sub-updates of a repository update step in the order they were generated on the server, which _may_ fix some non-deterministic update bugs we are trying to pin down
- all update processing tracking is overhauled. all related code and database access techniques have been brushed up and should use less CPU and fail more gracefully

next week

This work knocked me out. I had half hoped it would be a simple little thing, just splitting one x/y into multiple, but instead it spiralled out into ten different 'ah, but what about that?' and 'man, that's actually been running bad for ages'. Rather than kick out garbage on a core system, I decided to give it some proper time and do extra IRL testing. However, I am behind on messages, recent bug reports, other small work, and the Client API, so I'll now get to that.

Release Tomorrow! Anonymous Board owner 08/11/2021 (Wed) 03:32:22 Id: e3465e No.1116

I had an ok week. The update storage change last week went well, so that is polished and ready for everyone. I also caught up on some small fixes and quality of life and extended the Client API a little further.

The release should be normal time tomorrow.
