I have begun moving my document library from Zotero to Paperless. Though I don’t have a lot of incoming documents to process and file, I would find myself holding off because Zotero makes it difficult. To be fair to the application, I’m using it for general documents rather than the scientific papers it’s designed for.
Both Paperless and Zotero are document managers, an important tool in personal knowledge management. Metadata can be attached to each file, and its text is extracted for searching. This reduces the need for a file naming standard that encodes date, source, document type, etc.
I’m leaving Zotero because…
When a PDF file is added to Zotero, it will read metadata and create a parent item with the file stored within it. This allows notes or even other PDF files to be added into a single bundle.
Most of my documents are scanned and the “source” metadata that Zotero looks for is never there. In such cases, it does not create a parent item but leaves the PDF in isolation. I can’t add metadata to it until I create a shell parent item around it.
Adding metadata is slow and messy. If I add multiple documents from the same source with the same date, I have to enter the source and date for each of them individually.
I have Zotero set up to use a local WebDAV server so my files never leave home, but the library definitions are still synced. That is good in that it lets me use Zotero on Windows and Mac, but bad because it sends information that I’d rather not have on a public server.
I’m moving to Paperless because…
Adding tags, dates, source and document type is fast, with the engine suggesting what it thinks is correct. As more documents are added, the engine learns to recognise patterns in the text and gets closer to the right answers without my involvement. I don’t need to set them: I only need to check them.
Exporting from Zotero
Problematic but mostly overcome. There is no “export all files into a folder structure that matches Zotero’s collection structure” option. That is poor form.
I can export all files in a collection folder in one go, but only if each file is already on my computer. If a file is on the WebDAV server only (maybe I added it on the Mac and have not yet viewed it on Windows), the export saves the files it can see and silently skips those it can’t. Again, poor form.
On the WebDAV server, each file is stored inside a zip file with a 6-letter name. On Windows, the same file sits directly inside a folder of that name in Zotero’s storage. There is no “download everything” button, and “download on sync” did not behave as expected. The answer is to copy the zip files to a temporary folder, unzip them using 7-Zip with the */ option so that each archive is extracted into its own folder, and copy the resulting folders into Zotero’s storage. Restart Zotero and the files are there.
After I did this on Windows and then opened Zotero on my MacBook, it did download all files when syncing, so I wonder if the change I made on Windows triggered it.
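The unzip step can also be scripted instead of done through 7-Zip. The following is only a minimal sketch in Python, assuming the zip files have already been copied off the server into a temporary folder. The function name `unpack_archives` and both folder paths are hypothetical, and it should be run with Zotero closed.

```python
import zipfile
from pathlib import Path


def unpack_archives(zip_dir: Path, storage_dir: Path) -> list[str]:
    """Extract every <name>.zip in zip_dir into storage_dir/<name>/.

    Returns the names that were unpacked. Folders that already hold
    files are skipped so existing attachments are not overwritten.
    """
    unpacked = []
    for archive in sorted(zip_dir.glob("*.zip")):
        target = storage_dir / archive.stem  # folder named after the zip file
        if target.is_dir() and any(target.iterdir()):
            continue  # attachment is already present locally
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        unpacked.append(archive.stem)
    return unpacked


# Example call (both paths are placeholders, not Zotero defaults):
# unpack_archives(Path("C:/temp/zotero-zips"), Path("C:/Users/me/Zotero/storage"))
```

Skipping folders that already contain files means the script can be re-run after copying more zip files without touching what is already in place.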
Folder by folder
It’s now a matter of working through one collection folder at a time and adding its files to Paperless. The controlled move will let me manage and check metadata as I go. No rush. All new documents will go right to their new home. Zotero documents will follow as I work through them.
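For a single exported collection, the hand-off could look something like the sketch below. It assumes Paperless is set up to watch a consumption folder and import whatever lands there. The function name `queue_for_paperless` and the example paths are hypothetical, and metadata still needs checking in Paperless afterwards.

```python
import shutil
from pathlib import Path


def queue_for_paperless(collection_dir: Path, consume_dir: Path) -> list[Path]:
    """Copy every PDF under collection_dir into consume_dir.

    Returns the paths of the copies. A numeric suffix is added when two
    files share a name, so nothing in the consume folder is overwritten.
    """
    consume_dir.mkdir(parents=True, exist_ok=True)
    pdfs = sorted(
        p for p in collection_dir.rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    copied = []
    for pdf in pdfs:
        dest = consume_dir / pdf.name
        n = 1
        while dest.exists():  # avoid clobbering a same-named file
            dest = consume_dir / f"{pdf.stem}-{n}{pdf.suffix}"
            n += 1
        shutil.copy2(pdf, dest)  # copy, so the Zotero export stays intact
        copied.append(dest)
    return copied


# Example call (placeholder paths):
# queue_for_paperless(Path("exports/Insurance"), Path("/srv/paperless/consume"))
```

Copying rather than moving keeps the exported collection as a fallback until the documents have been checked in Paperless.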
