/hydrus/ - Hydrus Network

Bug reports, feature requests, and other discussion for the hydrus network.


Version 445 Anonymous Board owner 06/30/2021 (Wed) 22:38:46 Id: a5f35a No. 1094
https://youtube.com/watch?v=ulvvQHD4184
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v445/Hydrus.Network.445.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v445/Hydrus.Network.445.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v445/Hydrus.Network.445.-.macOS.-.App.dmg
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v445/Hydrus.Network.445.-.Linux.-.Executable.tar.gz

I had a great week mostly working on optimisations and cleanup. A big busy client running a lot of importers should be a little snappier today.

optimisations

Several users have had bad UI hangs recently, sometimes lasting several seconds. The problem correlates with running many downloaders at once, so with their help I gathered some profiles of what was going on, and this week I trimmed and rearranged some of the ways downloaders and file imports work. There is now less stress on the database when lots of things are going on at once, and all the code here is a little more sensible for future improvements. I do not think I have fixed the hangs, but they may be less severe overall, or the hang may have been pushed to a specific trigger like file loads.

So there is still more to do. The main problem, I believe, is that I designed the latest version of the downloader engine before we even had multiple downloaders per page. An assumed maximum of about twenty download queues is baked into the system, whereas many users may have a hundred or more sitting around, often finished or paused, with each one still taking a little overhead CPU on certain update calls. A complete overhaul of this system is long overdue but will be a large job, so I'm going to focus on chipping away at the worst offenders in the meantime.
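As a rough illustration of why that matters (invented names, not hydrus's actual scheduler code): in a polling design, every update tick pays a small cost per queue even when nearly all queues are dormant, whereas an event-driven design only ever touches queues that have work:

    # toy sketch of the two designs; the queue objects and do_some_work
    # are hypothetical stand-ins, not hydrus internals

    # polling: per-tick cost scales with the TOTAL queue count, so 100+
    # mostly finished/paused queues still burn a little CPU every tick
    def update_tick_polling(all_queues):
        for q in all_queues:
            if not q.dormant:
                q.do_some_work()

    # event-driven: dormant queues cost nothing, because a queue is only
    # registered in the active set while it actually has work to do
    def update_tick_active_set(active_queues):
        for q in list(active_queues):
            q.do_some_work()
            if q.dormant:
                active_queues.discard(q)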

As a result, I have improved some of the profiling code. The 'callto' profile mode now records the UI side of background jobs (usually the moment they publish their results), and the 'review threads' debug dialog now shows detailed information on the newer job scheduler system, which I believe is being overwhelmed by micro downloader jobs in heavy clients. I hope these will help as I continue working with the users who have had trouble, so please let me know how you get on this week and we'll give it another round.
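For a sense of what a profile mode like that captures, here is a generic sketch built on Python's standard profiler; hydrus's real profiling code differs, and profile_ui_callable is an invented name:

    import cProfile
    import io
    import pstats

    def profile_ui_callable(func, *args, **kwargs):
        # wrap a job scheduled on the UI thread so its cost shows up in a report
        profiler = cProfile.Profile()
        profiler.enable()
        try:
            return func(*args, **kwargs)
        finally:
            profiler.disable()
            report = io.StringIO()
            pstats.Stats(profiler, stream=report).sort_stats('cumulative').print_stats(10)
            print(report.getvalue())  # hydrus writes to a profile log instead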

the rest

I fixed some crazy add/delete logic bugs in the filename tagging dialog and its 'tags just for selected files' list. Tag removes will stick better and work more precisely on the current selection.
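The full changelog below describes the symptom as all selected paths accidentally sharing the same list of tags, which is the classic shape of a shared-mutable-object bug in Python. A toy illustration of that pitfall, assumed for the example rather than taken from the dialog's actual code:

    selected_paths = ['a.jpg', 'b.jpg', 'c.jpg']

    # the pitfall: one list object reused as the value for every key, so a
    # tag added 'just for' one path silently shows up on all of them
    tags_for_path = dict.fromkeys(selected_paths, [])
    tags_for_path['a.jpg'].append('blue sky')
    print(tags_for_path['c.jpg'])  # ['blue sky'] -- leaked to the other paths

    # the fix: give every path its own independent list
    tags_for_path = {path: [] for path in selected_paths}
    tags_for_path['a.jpg'].append('blue sky')
    print(tags_for_path['c.jpg'])  # [] -- edits now stay with the selection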

If you upload tags to the PTR and notice some lag after it finishes, this should be fixed now. A safety routine that verifies everything is uploaded and counted correctly was not working efficiently.

I fixed viewing extremely small images (like 1x1) in the media viewer. The new tiled renderer had a problem with zooms greater than 76800%, ha ha ha.

A bunch of sites with weird encodings (mostly old or Japanese) should now work in the downloader system.
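The full changelog below has the specifics; the general shape of the fallback chain is roughly this (a simplified sketch, not hydrus's actual decode function):

    import chardet  # pulled in by 'requests', so it is almost certainly present

    def decode_webpage(data, declared_encoding=None):
        # first, trust the encoding the server declared, if python can use it
        if declared_encoding is not None:
            try:
                return data.decode(declared_encoding)
            except (LookupError, UnicodeDecodeError):
                # LookupError covers names python has no decoder for, e.g.
                # "Windows-31J", which is effectively Shift-JIS
                pass
        # second, let chardet sniff the raw bytes and guess
        guess = chardet.detect(data)
        if guess['encoding'] is not None:
            try:
                return data.decode(guess['encoding'])
            except (LookupError, UnicodeDecodeError):
                pass
        # last resort: windows-1252, replacing anything undecodable
        return data.decode('windows-1252', errors='replace')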

Added a link to Iwara-Hydrus (https://github.com/GoAwayNow/Iwara-Hydrus), a userscript that simplifies sending Iwara videos to Hydrus Network, to the Client API help.

If you are a Windows user, you should be able to run the client from a network location again. This broke around v439, when we moved to the new github build. It was a build issue with some new libraries.


Anonymous Board owner 06/30/2021 (Wed) 22:39:17 Id: a5f35a No.1095
full list

- misc:
- fixed some weird bugs in the pathname tagging dialog related to removing and re-adding tags with its 'tags just for selected files' list. previously, in some circumstances, all selected paths could accidentally share the same list of tags, so further edits on a subset selection could affect the entire former selection
- furthermore, removing a tag from that list when the current path selection has differing tags now just removes that tag and does not accidentally add anything
- if your client's pending menu has a 'sticky' small tag count that does not seem to clear, the client now tries to recognise the specific miscount that causes this situation and gives you a little popup with instructions on the correct maintenance routine to fix it
- when a pending upload finishes, the client is now more careful about when it clears the pending count. this is a safety routine, but it is not always needed
- when the pending count is recalculated from source, it now uses the older method of counting table rows again. the new 'optimised' count, which works great for current mappings, was relatively very slow for the pending count on large services like the PTR
- fixed rendering images at >76800% zoom (usually 1x1 pixels in the media viewer), which had broken with the tiled renderer
- improved the serialised png load fix from last week--it now covers more situations
- added a link to Iwara-Hydrus (https://github.com/GoAwayNow/Iwara-Hydrus), a userscript that simplifies sending Iwara videos to Hydrus Network, to the client api help
- it should now again be possible to run the client on Windows when the exe is in a network location. it was a build issue related to modern versions of pyinstaller and shiboken2
- thanks to a user's help, the UPnPc executable discoverer now searches your PATH, and also searches for 'upnpc' executable name as a possible alternative on linux and macOS
- also thanks to a user, the test script process now exits with code 1 if the test is not OK
- .
- optimisations:
- when a db job is reading data and happens to fall on a transaction boundary, the result is now returned before the transaction is committed. this should reduce random job lag when the client is busy (see the first sketch after this list)
- greatly reduced the amount of database time it takes to check whether a file is 'already in db'. the db lookup here is pretty much always less than a millisecond, but the program also double-checks against your actual file store (so it can neatly and silently fill in missing files with regular imports), and on an HDD with a couple million files this could often be a 20ms request! (some user profiles I saw were 200ms!!! I presume from high-latency drives and/or NAS storage that was also very busy at the time). since many download queues will have bursts of a page or more of 'already in db' results (from url or hash lookups), this is why they typically only run 30-50 import items a second these days, and until this week, why this situation was blatting the db so hard. the path existence disk request is now pulled out of precious db time, so other jobs can do db work while the importer waits for disk I/O on its own thread. I suspect the key to getting the 20ms down to 8ms will be a more granular file store in future (more than 256 folders once you have more than x files per folder, etc...), which I have plans for. I know this change will de-clunk db access when a lot of importers are working, but we'll see this week if the queues actually process a little faster now that they can do file presence checks in parallel, and with luck the OS/disk will order their I/O requests cleverly. it may or may not relieve the UI hangs some people have seen, but if these checks are causing trouble it should expose the next bottleneck (see the second sketch after this list)
- optimised a small test that checks if a single tag is in the parent/sibling system, typically before adding tags to a file (and hence sometimes spammed when downloaders were working). there was a now-unneeded safety check in here that I believe was throwing off the query planner in some situations
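A sketch of the transaction-boundary change from the first item above, assuming a single-database-thread model with invented class names:

    import threading

    class ReadJob:
        def __init__(self, sql, args=()):
            self.sql, self.args = sql, args
            self.result = None
            self._done = threading.Event()

        def publish(self, result):
            self.result = result
            self._done.set()  # the waiting caller wakes up immediately

        def wait(self):
            self._done.wait()
            return self.result

    def db_thread_loop(connection, job_queue, jobs_per_transaction=100):
        # single db thread: many small jobs are batched into one transaction
        jobs_done = 0
        while True:
            job = job_queue.get()
            rows = connection.execute(job.sql, job.args).fetchall()
            # hand the rows back BEFORE paying for the commit, so a read that
            # lands on a transaction boundary no longer waits on the disk flush
            job.publish(rows)
            jobs_done += 1
            if jobs_done >= jobs_per_transaction:
                connection.commit()
                jobs_done = 0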
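And a sketch of the 'already in db' file presence change from the second item; the fan-out into 256 'fxx' folders mirrors hydrus's client_files layout, but the helper names here are hypothetical:

    import os

    def expected_file_path(client_files_dir, sha256_hex, ext):
        # files are fanned out into 256 folders keyed on the first two hex
        # characters of the sha256, e.g. client_files/f8a/8a13...ff.jpg
        prefix = 'f' + sha256_hex[:2]
        return os.path.join(client_files_dir, prefix, sha256_hex + ext)

    def predict_import_status(db, client_files_dir, sha256_hex, ext='.jpg'):
        # fast part, on the db thread: is this hash known? usually <1ms
        if not db.hash_is_known(sha256_hex):  # hypothetical db call
            return 'new file, import it'
        # slow part, now on the importer's own thread, OUTSIDE db time: an
        # os.path.exists() on a busy HDD or NAS can take 20ms or more, and
        # holding the db lock for that was what clogged everything up
        if os.path.exists(expected_file_path(client_files_dir, sha256_hex, ext)):
            return 'already in db, skip'
        return 'in db but missing from disk, redownload'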


Anonymous Board owner 06/30/2021 (Wed) 22:40:07 Id: a5f35a No.1096
- the 'review threads' debug UI now has two new tabs for the job schedulers. I will be working with UI-lag-experiencing users in future to see where the biggest problems are here. I suspect part of it will be overhead from downloader thread spam, which I have more plans for
- all jobs that threads schedule on main UI time are now profiled in 'callto' profile mode
- .
- site encoding fixes:
- fixed a problem with webpages that report an encoding for which there is no available decoder. this error is now caught properly, and if 'chardet' is available to provide a supported encoding, it now steps in and fixes things automatically. for most users, this fixes japanese sites that report their encoding as "Windows-31J", which seems to be a synonym for Shift-JIS. the 'non-failing unicode decode' function here is also now better at not failing, ha ha, and it delivers richer error descriptions when all attempts to decode fail
- fixed a problem detecting and decoding webpages with no specified encoding (which, per some weird internet standard, defaults to windows-1252 and/or ISO-8859-1) using chardet
- if chardet is not available and all else fails, windows-1252 is now attempted as a last resort
- added chardet presence to help->about. requests needs it at the moment, so you almost certainly have it, but I'll make it explicit in requirements.txt and expand the info about it in future
- .
- boring code cleanup:
- refactored the base file import job to its own file
- client import options are moved to a new submodule, and file, tag, and the future note import options are refactored to their own files
- wrote a new object to handle current import file status in a better way than the old 'toss a tuple around' method
- implemented this file import status across most of the import pipeline and cleaned up a heap of import status, hash, mime, and note handling. rarely do downloaders now inspect raw file import status directly--they just ask the import and status object what they think should happen next based on current file import options etc...
- a url file import's pre-import status urls are now tested main url first, file url second, then associable urls (previously it was pseudorandom)
- a file import's pre-import status hashes are now tested sha256 first, if that is available (previously it was pseudorandom). this probably doesn't matter 99.998% of the time, but if you hit 'try again' on a watcher import that failed on a previous boot and also had a dodgy hash parser, it might
- misc pre-import status prediction logic cleanup, particularly when multiple urls disagree on status and 'exclude previously deleted' is _unchecked_
- when a hash gives a file pre-import status, the import note now records which hash type it was
- pulled the 'already in db but doesn't actually exist on disk' pre-import status check out of the db, fixing a long-time ugly file manager call and reducing db lock load significantly
- updated a host of hacky file import unit tests to less hacky versions with the new status object
- all scheduled jobs now print better information about themselves in debug code

next week

Next week is a 'medium size job' week. I would like to do some server work, particularly writing the 'null account' that will inherit all content ownership after a certain period, completely anonymising history and improving long-term privacy, and then see if I can whack away at some janitor workflow improvements.


Simple Release Tomorrow! Anonymous Board owner 07/07/2021 (Wed) 07:16:02 Id: 51afa1 No.1098
I had an ok week. I mostly focused on a new privacy improvement for repositories like the PTR, but otherwise I fixed a few bugs and improved some downloader UI.

The release should be as normal tomorrow.


