After exporting a “collection” in Zotero to send to an acquaintance, I re-imported the exported file to check that the archive was valid, choosing to “link” to the imported files rather than copying them into the Zotero data folder.
The import went smoothly and I deleted the collection again, but foolishly did not choose to delete the collection contents as well. This left me with a large number of duplicate items scattered around as “orphan” items that belonged to no collection.
At this point, I should have used the Unfiled Items pane to find those duplicates and permanently delete them. But because I had not noticed the duplicates, I did not do this, and over time the Unfiled Items pane filled up with unrelated items (e.g. snapshots saved through the web extension without selecting a collection).
When I finally noticed the duplicated items, I first tried the “merge” function via the Duplicate Items pane. This got rid of the duplicate items themselves, but inside each merged item the notes, PDF attachments and website snapshots were now duplicated, with the duplicates pointing to the data export folder.
After a little digging, it turns out that Zotero marks imported “linked” attachments internally with a “link mode” of 2:
LINK_MODE_IMPORTED_FILE = 0;
LINK_MODE_IMPORTED_URL = 1;
LINK_MODE_LINKED_FILE = 2;
LINK_MODE_LINKED_URL = 3;
LINK_MODE_EMBEDDED_IMAGE = 4;
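When inspecting attachments, it can help to turn the numeric link mode back into its constant name. The following is only a minimal sketch in plain JavaScript: `LINK_MODES` and `linkModeName` are hypothetical helpers that mirror the values listed above and are not part of Zotero's API.

```javascript
// Hypothetical lookup table mirroring the link mode constants listed above.
const LINK_MODES = {
  0: 'LINK_MODE_IMPORTED_FILE',
  1: 'LINK_MODE_IMPORTED_URL',
  2: 'LINK_MODE_LINKED_FILE',
  3: 'LINK_MODE_LINKED_URL',
  4: 'LINK_MODE_EMBEDDED_IMAGE',
};

// Return the constant name for a numeric link mode, or 'UNKNOWN' if the
// value is not one of the five modes above.
function linkModeName(mode) {
  return Object.prototype.hasOwnProperty.call(LINK_MODES, mode)
    ? LINK_MODES[mode]
    : 'UNKNOWN';
}
```

For the attachments left behind by my import, `linkModeName(2)` gives `LINK_MODE_LINKED_FILE`.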
Using the Zotero JavaScript API, I cleaned up the duplicates.
Go to Tools -> Developer -> Run JavaScript, enter the code below, uncheck Run as async function, and click Run. Adjust FS_LOC to the start of the full path of the linked imported Zotero collection, e.g. C:/ or C:\ for Windows.
const ZoteroPane = Zotero.getActiveZoteroPane();
// Only the first selected item is processed
const selectedItems = [ZoteroPane.getSelectedItems()[0]];
// Collected for inspection only (see the commented-out lines at the end)
const attachmentsToDelete = [];
const itemsWithDuplicateNotes = [];
const itemsWithDuplicateLinks = [];
// Change this to 'C:/' for Windows or '/Users' for Mac
const FS_LOC = '/home';
for (let item of selectedItems) {
  if (item.isRegularItem()) {
    let attachmentIDs = item.getAttachments();
    // URLs seen so far among this item's attachments
    let links = [];
    for (let id of attachmentIDs) {
      let attachment = Zotero.Items.get(id);
      let url = attachment.getField('url');
      // Remove "linked" file attachments that point into the export folder
      if (attachment.attachmentLinkMode === 2 &&
          attachment.attachmentPath.startsWith(FS_LOC)) {
        // Could also check if attachment with same name already exists
        attachmentsToDelete.push(attachment);
        attachment.erase();
        attachment.saveTx();
        Zotero.Fulltext.indexItems([attachment.id]);
      }
      // Remove duplicate linked URL attachments (same URL seen before)
      if (attachment.attachmentLinkMode === 3 && links.includes(url)) {
        itemsWithDuplicateLinks.push({
          f: attachment,
          url: url,
        });
        attachment.erase();
        attachment.saveTx();
        Zotero.Fulltext.indexItems([attachment.id]);
      }
      links.push(url);
    }
    let notes = item.getNotes();
    // Note HTML seen so far for this item
    let noteContents = [];
    for (let id of notes) {
      let note = Zotero.Items.get(id);
      let noteHTML = note.getNote();
      // Remove duplicate notes (identical HTML to an earlier note)
      if (noteContents.includes(noteHTML)) {
        itemsWithDuplicateNotes.push({
          text: noteHTML,
          id: item.getID(),
          title: item.getDisplayTitle(),
        });
        note.erase();
        note.saveTx();
        Zotero.Fulltext.indexItems([note.id]);
      }
      noteContents.push(noteHTML);
    }
  }
}
// Uncomment one of these to inspect what was collected:
//attachmentsToDelete;
//itemsWithDuplicateNotes;
//itemsWithDuplicateLinks;
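Because the script deletes things, it may be worth trying the selection logic on plain data first. The sketch below is not Zotero code: it runs in any JavaScript engine and works on plain objects, and the names `normalisePath`, `planAttachmentCleanup` and `planNoteCleanup`, along with the object shapes `{ id, linkMode, path, url }` and `{ id, html }`, are assumptions made for illustration. It also normalises backslashes so that a Windows-style prefix matches either separator, which the script above does not do, and it ignores empty URLs when looking for duplicate links.

```javascript
// Normalise Windows backslashes so 'C:\\Users' and 'C:/Users' compare equal.
function normalisePath(p) {
  return p.replace(/\\/g, '/');
}

// Decide which attachments the cleanup would remove, without deleting anything.
// `attachments` is an array of plain objects: { id, linkMode, path, url }.
function planAttachmentCleanup(attachments, fsLoc) {
  const prefix = normalisePath(fsLoc);
  const seenUrls = new Set();
  const toRemove = [];
  for (const att of attachments) {
    // Linked file (mode 2) whose path starts with the export folder prefix
    const isLinkedExport = att.linkMode === 2 &&
      typeof att.path === 'string' &&
      normalisePath(att.path).startsWith(prefix);
    // Linked URL (mode 3) whose URL was already seen on an earlier attachment
    const isDuplicateUrl = att.linkMode === 3 && seenUrls.has(att.url);
    if (isLinkedExport || isDuplicateUrl) {
      toRemove.push(att.id);
    }
    // Empty URLs are skipped so they never count as duplicates
    if (att.url) seenUrls.add(att.url);
  }
  return toRemove;
}

// Return the ids of notes whose HTML repeats an earlier note's HTML.
// `notes` is an array of plain objects: { id, html }.
function planNoteCleanup(notes) {
  const seen = new Set();
  const toRemove = [];
  for (const note of notes) {
    if (seen.has(note.html)) toRemove.push(note.id);
    seen.add(note.html);
  }
  return toRemove;
}

// Made-up sample data: one stored PDF, one linked copy, two identical links.
const sample = [
  { id: 1, linkMode: 0, path: 'storage:paper.pdf', url: '' },
  { id: 2, linkMode: 2, path: '/home/me/export/files/1/paper.pdf', url: '' },
  { id: 3, linkMode: 3, path: null, url: 'https://example.org/a' },
  { id: 4, linkMode: 3, path: null, url: 'https://example.org/a' },
];
console.log(planAttachmentCleanup(sample, '/home')); // logs [ 2, 4 ]
```

Keeping the decision separate from the deletion makes it possible to review the list of ids before anything is erased.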
For all methods of the Zotero.Item object, see Zotero’s source code for item.js.
See also: Zotero file relink.