Filter
Reset all

Subjects

Content Types

Countries

AID systems

API

Certificates

Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type

Keywords

Metadata standards

PID systems

Provider types

Quality management

Repository languages

Software

Syndications

Repository types

Versioning

  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 82 result(s)
In keeping with the open data policies of the U.S. Agency for International Development (USAID) and Bill & Melinda Gates Foundation, the Cereal Systems Initiative for South Asia (CSISA) has launched the CSISA Data Repository to ensure public accessibility to key data sets, including crop cut data- directly observed, crop yield estimates, on-station and on-farm research trial data and socioeconomic surveys. CSISA is a science-driven and impact-oriented regional initiative for increasing the productivity of cereal-based cropping systems in Bangladesh, India and Nepal, thus improving food security and farmers’ livelihoods. CSISA generates data that is of value and interest to a diverse audience of researchers, policymakers and the public. CSISA’s data repository is hosted on Dataverse, an open source web application developed at Harvard University to share, preserve, cite, explore and analyze research data. CSISA’s repository contains rich datasets, including on-station trial data from 2009–17 about crop and resource management practices for sustainable future cereal-based cropping systems. Collection of this data occurred during the long-term, on-station research trials conducted at the Indian Council of Agricultural Research – Research Complex for the Eastern Region in Bihar, India. The data include information on agronomic management for the sustainable intensification of cropping systems, mechanization, diversification, futuristic approaches to sustainable intensification, long-term effects of conservation agriculture practices on soil health and the pest spectrum. Additional trial data in the repository includes nutrient omission plot technique trials from Bihar, eastern Uttar Pradesh and Odisha, India, covering 2012–15, which help determine the indigenous nutrient supplying ability of the soil. This data helps develop precision nutrient management approaches that would be most effective in different types of soils. CSISA’s most popular dataset thus far includes crop cut data on maize in Odisha, India and rice in Nepal. Crop cut datasets provide ground-truthed yield estimates, as well as valuable information on relevant agronomic and socioeconomic practices affecting production practices and yield. A variety of research data on wheat systems are also available from Bangladesh and India. Additional crop cut data will also be coming online soon. Cropping system-related data and socioeconomic data are in the repository, some of which are cross-listed with a Dataverse run by the International Food Policy Research Institute. The socioeconomic datasets contain baseline information that is crucial for technology targeting, as well as to assess the adoption and performance of CSISA-supported technologies under smallholder farmers’ constrained conditions, representing the ultimate litmus test of their potential for change at scale. Other highly interesting datasets include farm composition and productive trajectory information, based on a 20-year panel dataset, and numerous wheat crop cut and maize nutrient omission trial data from across Bangladesh.
LibraData is a place for UVA researchers to share data publicly. It is UVA's local instance of Dataverse. LibraData is part of the Libra Scholarly Repository suite of services which includes works of UVA scholarship such as articles, books, theses, and data.
The Hive is the University of Utah's institutional data repository. All datasets are associated with a U of Utah current or past faculty member or student. They are also all freely available under creative commons licenses. The repository spans disciplines and has over 80 datasets at this time.
Brainlife promotes engagement and education in reproducible neuroscience. We do this by providing an online platform where users can publish code (Apps), Data, and make it "alive" by integragrate various HPC and cloud computing resources to run those Apps. Brainlife also provide mechanisms to publish all research assets associated with a scientific project (data and analyses) embedded in a cloud computing environment and referenced by a single digital-object-identifier (DOI). The platform is unique because of its focus on supporting scientific reproducibility beyond open code and open data, by providing fundamental smart mechanisms for what we refer to as “Open Services.”
The Biological and Chemical Oceanography Data Management Office (BCO-DMO) is a publicly accessible earth science data repository created to curate, publicly serve (publish), and archive digital data and information from biological, chemical and biogeochemical research conducted in coastal, marine, great lakes and laboratory environments. The BCO-DMO repository works closely with investigators funded through the NSF OCE Division’s Biological and Chemical Sections and the Division of Polar Programs Antarctic Organisms & Ecosystems. The office provides services that span the full data life cycle, from data management planning support and DOI creation, to archive with appropriate national facilities.
The University of Pittsburgh English Language Institute Corpus (PELIC) is a 4.2-million-word learner corpus of written texts. These texts were collected in an English for Academic Purposes (EAP) context over seven years in the University of Pittsburgh’s Intensive English Program, and were produced by over 1100 students with a wide range of linguistic backgrounds and proficiency levels. PELIC is longitudinal, offering greater opportunities for tracking development in a natural classroom setting.
Bioinformatics.org serves the scientific and educational needs of bioinformatic practitioners and the general public. We develop and maintain computational resources to facilitate world-wide communications and collaborations between people of all educational and professional levels. We provide and promote open access to the materials and methods required for, and derived from, research, development and education.
The Earth System Grid Federation (ESGF) is an international collaboration with a current focus on serving the World Climate Research Programme's (WCRP) Coupled Model Intercomparison Project (CMIP) and supporting climate and environmental science in general. Data is searchable and available for download at the Federated ESGF-CoG Nodes https://esgf.llnl.gov/nodes.html
FLOSSmole is a collaborative collection of free, libre, and open source software (FLOSS) data. FLOSSmole contains nearly 1 TB of data covering the period 2004 until now, about more than 500,000 different open source projects.
The Harvard Dataverse is open to all scientific data from all disciplines worldwide. It includes the world's largest collection of social science research data. It is hosting data for projects, archives, researchers, journals, organizations, and institutions.
OpenWorm aims to build the first comprehensive computational model of the Caenorhabditis elegans (C. elegans), a microscopic roundworm. With only a thousand cells, it solves basic problems such as feeding, mate-finding and predator avoidance. Despite being extremely well studied in biology, this organism still eludes a deep, principled understanding of its biology. We are using a bottom-up approach, aimed at observing the worm behaviour emerge from a simulation of data derived from scientific experiments carried out over the past decade. To do so we are incorporating the data available in the scientific community into software models. We are engineering Geppetto and Sibernetic, open-source simulation platforms, to be able to run these different models in concert. We are also forging new collaborations with universities and research institutes to collect data that fill in the gaps All the code we produce in the OpenWorm project is Open Source and available on GitHub.
Reference anatomies of the brain and corresponding atlases play a central role in experimental neuroimaging workflows and are the foundation for reporting standardized results. The choice of such references —i.e., templates— and atlases is one relevant source of methodological variability across studies, which has recently been brought to attention as an important challenge to reproducibility in neuroscience. TemplateFlow is a publicly available framework for human and nonhuman brain models. The framework combines an open database with software for access, management, and vetting, allowing scientists to distribute their resources under FAIR —findable, accessible, interoperable, reusable— principles. TemplateFlow supports a multifaceted insight into brains across species, and enables multiverse analyses testing whether results generalize across standard references, scales, and in the long term, species, thereby contributing to increasing the reliability of neuroimaging results.
Mountain Scholar is an open access repository service that collects, preserves, and provides access to digitized library collections and other scholarly and creative works from several academic entities within the state of Colorado. Colorado State University research data from the fall of 2022 and forward is available in Dryad; CSU legacy research data prior to fall 2022 is in Mountain Scholar.
The CONP portal is a web interface for the Canadian Open Neuroscience Platform (CONP) to facilitate open science in the neuroscience community. CONP simplifies global researcher access and sharing of datasets and tools. The portal internalizes the cycle of a typical research project: starting with data acquisition, followed by processing using already existing/published tools, and ultimately publication of the obtained results including a link to the original dataset. From more information on CONP, please visit https://conp.ca
Atmosphere to Electrons (A2e) is a new, multi-year, multi-stakeholder U.S. Department of Energy (DOE) research and development initiative tasked with improving wind plant performance and mitigating risk and uncertainty to achieve substantial reduction in the cost of wind energy production. The A2e strategic vision will enable a new generation of wind plant technology, in which smart wind plants are designed to achieve optimized performance stemming from more complete knowledge of the inflow wind resource and complex flow through the wind plant.
UltraViolet is part of a suite of repositories at New York University that provide a home for research materials, operated as a partnership of the Division of Libraries and NYU IT's Research and Instruction Technology. UltraViolet provides faculty, students, and researchers within our university community with a place to deposit scholarly materials for open access and long-term preservation. UltraViolet also houses some NYU Libraries collections, including proprietary data collections.
The Community Coordinated Modeling Center (CCMC) is a multi-agency partnership based at the NASA Goddard Space Flight Center in Greenbelt, Maryland and a component of the National Space Weather Program. The CCMC provides, to the international research community, access to modern space science simulations. In addition, the CCMC supports the transition to space weather operations of modern space research models.
The Digital Repository at the University of Maryland (DRUM) collects, preserves, and provides public access to the scholarly output of the university. Faculty and researchers can upload research products for rapid dissemination, global visibility and impact, and long-term preservation.
The Million Song Dataset is a freely-available collection of audio features and metadata for a million contemporary popular music tracks. The core of the dataset is the feature analysis and metadata for one million songs, provided by The Echo Nest. The dataset does not include any audio, only the derived features. Note, however, that sample audio can be fetched from services like 7digital, using code we provide.
The KNB Data Repository is an international repository intended to facilitate ecological, environmental and earth science research in the broadest senses. For scientists, the KNB Data Repository is an efficient way to share, discover, access and interpret complex ecological, environmental, earth science, and sociological data and the software used to create and manage those data. Due to rich contextual information provided with data in the KNB, scientists are able to integrate and analyze data with less effort. The data originate from a highly-distributed set of field stations, laboratories, research sites, and individual researchers. The KNB supports rich, detailed metadata to promote data discovery as well as automated and manual integration of data into new projects. The KNB supports a rich set of modern repository services, including the ability to assign Digital Object Identifiers (DOIs) so data sets can be confidently referenced in any publication, the ability to track the versions of datasets as they evolve through time, and metadata to establish the provenance relationships between source and derived data.
The Linguistic Data Consortium (LDC) is an open consortium of universities, libraries, corporations and government research laboratories. It was formed in 1992 to address the critical data shortage then facing language technology research and development. Initially, LDC's primary role was as a repository and distribution point for language resources. Since that time, and with the help of its members, LDC has grown into an organization that creates and distributes a wide array of language resources. LDC also supports sponsored research programs and language-based technology evaluations by providing resources and contributing organizational expertise. LDC is hosted by the University of Pennsylvania and is a center within the University’s School of Arts and Sciences.
This website is a portal that enables access to multi-Terabyte turbulence databases. The data reside on several nodes and disks on our database cluster computer and are stored in small 3D subcubes. Positions are indexed using a Z-curve for efficient access.