• 12 Posts
  • 8 Comments
Joined 1 year ago
Cake day: July 13th, 2023

  • If the files are literally duplicated (byte-for-byte identical, so their md5sums match), you could just delete the duplicates and perhaps replace them with links.

    If it were only a handful of ebooks I’d consider using symlinks, but with a large collection that seems daunting, unless there is a simple way to automate that?

    Automatically sorting books by category isn’t so easy. Is the metadata any good? Are there categories already? ISBNs? Even titles and authors? It starts to become a bit of a project, but you could possibly import MARC records (library metadata), which have some of that info in them, if you can match up the books to library records. I expect that the openlibrary.org API still works, but I haven’t used it in ages.
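
The "delete duplicates and replace them with links" idea above can probably be automated with a short script. What follows is only a rough sketch, not a tested tool: it hashes with SHA-256 instead of md5 (either works for spotting identical bytes), the function names are made up for illustration, and it assumes a filesystem where symlinks are allowed (on Windows that may need extra privileges). It defaults to a dry run so nothing is deleted until you ask for it.

```python
import hashlib
import os
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group regular files under root by content hash; keep groups of 2+."""
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        # Skip directories and existing symlinks.
        if path.is_file() and not path.is_symlink():
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}


def replace_with_symlinks(root: Path, dry_run: bool = True) -> list[tuple[Path, Path]]:
    """Keep the first file of each duplicate group; symlink the rest to it.

    Returns (duplicate, keeper) pairs. With dry_run=True nothing is changed.
    """
    replaced = []
    for paths in find_duplicates(root).values():
        keeper, *dupes = paths
        for dupe in dupes:
            if not dry_run:
                dupe.unlink()
                # Relative target, so the tree still works if it is moved.
                dupe.symlink_to(os.path.relpath(keeper, dupe.parent))
            replaced.append((dupe, keeper))
    return replaced
```

Calling `replace_with_symlinks(Path("ebooks"))` (the folder name is a placeholder) only lists what would be replaced; pass `dry_run=False` once the list looks right, ideally after making a backup.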

    If there’s still no simple way to get the metadata from the file hashes, I’ll just wait until AI becomes intelligent enough to retrieve it. I’m looking for a solution that doesn’t require manual organization or much time, whether that means extracting metadata based on file hashes or some other automated method. Most of the files should have title and author metadata, but some won’t. I’m not in a rush to solve this, and I can still find most ebooks by title without any organization after all.
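
For the files that do carry title and author metadata, reading it can be scripted without any manual work, at least for EPUBs. The sketch below assumes the books are EPUB files (a zip archive whose `META-INF/container.xml` points at an OPF package with Dublin Core fields); other formats would need a different reader. The helper names are hypothetical, and the URL builder only assumes that Open Library's search endpoint accepts `title` and `author` parameters, so treat it as a starting point rather than a confirmed recipe. Missing fields come back as `None`.

```python
import xml.etree.ElementTree as ET
import zipfile
from urllib.parse import urlencode

DC = "{http://purl.org/dc/elements/1.1/}"
CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"


def epub_metadata(path):
    """Read title, author and identifier from an EPUB's OPF package file."""
    with zipfile.ZipFile(path) as zf:
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find(f".//{CONTAINER_NS}rootfile")
        if rootfile is None:
            raise ValueError(f"no rootfile entry in {path}")
        opf = ET.fromstring(zf.read(rootfile.get("full-path")))

    def first(tag):
        # First Dublin Core element with this name, or None if absent/empty.
        el = opf.find(f".//{DC}{tag}")
        return el.text.strip() if el is not None and el.text else None

    return {
        "title": first("title"),
        "author": first("creator"),
        "identifier": first("identifier"),
    }


def openlibrary_search_url(title, author=None):
    """Build a search URL for Open Library (assumed endpoint; no request is made)."""
    params = {"title": title}
    if author:
        params["author"] = author
    return "https://openlibrary.org/search.json?" + urlencode(params)
```

One way to use it: loop over the collection, call `epub_metadata` on each file, and only query Open Library for books where the embedded fields are present but you still want subjects or ISBNs. Files with no usable metadata would still need some other approach.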