Repository

Choosing an appropriate repository 

Not all data repositories are equally appropriate for visual data. Choosing a repository is a complex, often difficult decision that directly affects how openness, access, and responsibility are managed throughout the research process. In visual social research, selecting a repository has ethical, legal, and epistemic consequences. Features that might seem neutral, like access controls, preview options, metadata design, licensing, or preservation policies, become key tools for managing visibility, reuse, and the risk of harm. Therefore, the question isn’t which repository is best overall, but whether a particular platform’s features and constraints align with the project’s specific needs. No single repository is suitable for all visual social research; each platform supports some forms of openness while restricting others, and these trade-offs must be carefully weighed.

Several criteria can help guide this choice: 

Type of repository 

Repositories can be either generic platforms or tailored to particular disciplines. Generalist repositories usually offer high visibility and standard open science features, but they often have limited support for ethical governance beyond basic configurations. In contrast, social science and qualitative archives might offer better access control, stewardship, and compliance mechanisms, even if they are less optimized for visual formats. 

Compliance with FAIR principles 

An appropriate repository should ensure findability (for example, through persistent identifiers such as DOIs), accessibility through metadata and clearly specified access rules, interoperability with standard formats, and reusability supported by clear documentation and licensing. For visual data, reuse often needs to be subject to extra restrictions or consent-based conditions.

Versioning and updates 

Repositories should support version control and transparent updates, especially when metadata is refined, materials are redacted, or consent conditions change. However, versioning systems vary greatly; some depend on strict record-based structures, which can restrict detailed updates or file-level modifications within a dataset. 

Costs and storage limits 

Some repositories are free, but many charge fees for uploads or long-term storage, and many also limit file size or overall storage capacity. This can be a significant constraint for visual social research, which often involves high-resolution images or videos. Therefore, assessing storage costs and technical limitations early in the project planning process is essential.
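
A rough version of this early assessment can be sketched in a few lines of Python. The example below is only an illustration: the batch sizes, the per-file limit, and the per-dataset limit are hypothetical figures, not the limits of any actual repository, and the names `MediaBatch`, `total_size_gb`, and `check_limits` are invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class MediaBatch:
    """A group of similar files, e.g. all photographs from one field site."""
    label: str
    count: int
    avg_size_mb: float  # estimated average size of one file, in megabytes


def total_size_gb(batches):
    """Estimated size of the whole deposit in gigabytes (1 GB = 1024 MB)."""
    return sum(b.count * b.avg_size_mb for b in batches) / 1024


def check_limits(batches, max_file_mb, max_dataset_gb):
    """Compare the estimate against a repository's (assumed) limits.

    Returns a pair: whether the deposit fits, and the labels of batches
    whose average file already exceeds the per-file limit.
    """
    oversized = [b.label for b in batches if b.avg_size_mb > max_file_mb]
    fits = total_size_gb(batches) <= max_dataset_gb and not oversized
    return fits, oversized


# Hypothetical project: 2000 photographs and 40 interview videos.
batches = [
    MediaBatch("photographs (.tiff)", count=2000, avg_size_mb=25),
    MediaBatch("interview videos (.mp4)", count=40, avg_size_mb=1500),
]

estimate = total_size_gb(batches)

# Hypothetical repository limits: 5 GB per file, 50 GB per dataset.
fits, oversized = check_limits(batches, max_file_mb=5 * 1024, max_dataset_gb=50)
```

With these assumed figures the deposit comes to roughly 107 GB, so it would not fit under a 50 GB dataset limit even though no single file is too large. Running this kind of estimate at the planning stage shows whether compression, a paid storage tier, or a different repository needs to be considered.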

Who can deposit and access the data 

Researchers should confirm who can upload data and the conditions for doing so. Access models differ: some repositories are openly accessible, while others have embargoes, restricted access, or require requests. These differences are especially important for visual data featuring identifiable individuals or sensitive situations. 

Compliance with institutional and funding requirements 

Repositories should align with funder and institutional expectations, including open-access policies, data-preservation standards, and FAIR principles. Some infrastructures are specifically designed to meet national or institutional requirements, which makes them more suitable in certain contexts.

Technical suitability for visual data 

Repositories need to manage complex visual datasets effectively. This involves handling large files, supporting formats like .jpg, .tiff, or .mp4, incorporating detailed metadata, and providing suitable access control. Ideally, they should also accommodate contextual and ethical documentation, including consent frameworks and usage guidelines. 

Limits and constraints of repositories 

All repositories have inherent limitations. Some enforce an “all-or-nothing” approach at the dataset level, which makes it difficult to apply different access rules to individual files. Others depend heavily on licensing as the primary governance method, which may be inadequate for sensitive visual data.

Preview features pose another major challenge. In many cases, previews like thumbnails or in-browser views are enabled by default for public files. For visual data, these previews essentially act as a form of distribution and could lead to unintended exposure or identification. In some repositories, preview settings are unclear or undocumented, making it hard to evaluate potential risks beforehand.

Finally, repositories differ in how they support metadata and documentation. Some rely mainly on dataset-level descriptions, while others allow item-level metadata and more granular contextualization. This distinction is especially important for visually heterogeneous datasets, where sensitivity and consent conditions may vary across individual files.
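
To make the difference between dataset-level and item-level governance concrete, the following Python sketch records access conditions per file and checks two things before deposit: whether the dataset mixes access levels (which an “all-or-nothing” repository could not express), and whether any openly accessible file showing identifiable people would be exposed through previews. It is an illustration under stated assumptions: the field names, access levels, and file names are hypothetical and do not correspond to the metadata schema of any particular repository.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    OPEN = "open"
    RESTRICTED = "restricted"  # e.g. available on request
    CLOSED = "closed"


@dataclass
class FileRecord:
    """Item-level description of one file (hypothetical fields)."""
    filename: str
    access: Access
    identifiable_people: bool
    preview_enabled: bool


def needs_file_level_control(records):
    """True if the files do not all share the same access level."""
    return len({r.access for r in records}) > 1


def risky_previews(records):
    """Open files showing identifiable people whose previews are enabled."""
    return [
        r.filename
        for r in records
        if r.access is Access.OPEN and r.identifiable_people and r.preview_enabled
    ]


# Hypothetical dataset with heterogeneous sensitivity.
records = [
    FileRecord("street_001.jpg", Access.OPEN, identifiable_people=False, preview_enabled=True),
    FileRecord("market_014.jpg", Access.OPEN, identifiable_people=True, preview_enabled=True),
    FileRecord("interview_03.mp4", Access.RESTRICTED, identifiable_people=True, preview_enabled=False),
]

mixed = needs_file_level_control(records)
flagged = risky_previews(records)
```

In this example the dataset mixes open and restricted files, so a repository that applies a single access rule to the whole dataset would force either over-restriction or over-exposure, and `market_014.jpg` is flagged because its preview would effectively distribute an image of identifiable people.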