  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) imply priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
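As a rough illustration of how the operators listed above combine, the following Python sketch builds a query string from small helper functions. The helper names are hypothetical and exist only for this example; how the search box actually parses the result is assumed from the operator list, not confirmed.

```python
# Hypothetical helpers that assemble a search string from the listed operators.
# None of these names come from the site; they are illustrative only.

def phrase(text: str) -> str:
    """Wrap text in double quotes so it is searched as a phrase."""
    return f'"{text}"'

def prefix(term: str) -> str:
    """Append * to allow a wildcard search on the end of a keyword."""
    return f"{term}*"

def fuzzy(term: str, distance: int) -> str:
    """Append ~N to request an edit distance (fuzziness) of N."""
    return f"{term}~{distance}"

def any_of(*parts: str) -> str:
    """Join parts with | (OR) and group them with parentheses."""
    return "(" + " | ".join(parts) + ")"

def exclude(term: str) -> str:
    """Prefix a term with - to express NOT."""
    return f"-{term}"

def all_of(*parts: str) -> str:
    """Join parts with spaces, relying on AND being the default."""
    return " ".join(parts)

query = all_of(
    prefix("lingu"),
    any_of(phrase("spoken language"), fuzzy("corpus", 1)),
    exclude("astronomy"),
)
print(query)
```

The grouping parentheses keep the OR alternatives together, so the wildcard term and the exclusion apply to the query as a whole.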
Found 88 result(s)
PORTULAN CLARIN is the Research Infrastructure for the Science and Technology of Language. It belongs to the Portuguese National Roadmap of Research Infrastructures of Strategic Relevance and is part of the international research infrastructure CLARIN ERIC.
IBICT provides a research data repository that handles long-term preservation and archiving in line with good practices, so that researchers can share their data, maintain control over it and get recognition for it. The repository supports research data sharing with persistent data citation, allowing the data to be reproduced. The Dataverse is a large open data repository of all disciplines, created by the Institute for Quantitative Social Science at Harvard University. The IBICT Dataverse repository provides a free means to deposit and find specific datasets stored by employees of the institutions participating in the Cariniana network.
The repository of the Hamburg Centre for Speech Corpora (HZSK) is used for archiving, maintenance, distribution and development of spoken language corpora. These usually consist of audio and/or video recordings, transcriptions and other data, and structured metadata. The corpora focus on multilingualism and are generally freely available for research and teaching. Most of the corpora maintained by the HZSK were created in the years 2000-2011 in the framework of the SFB 538 "Multilingualism" at the University of Hamburg. The HZSK also strives to take in linguistic data from other projects or contexts and to make them available to the scientific community for research and teaching, provided that they are compatible with the current focus of the HZSK, i.e. especially spoken language and multilingualism.
The goal of the Center of Estonian Language Resources (CELR) is to create and manage an infrastructure to make Estonian-language digital resources (dictionaries, corpora – both text and speech –, various language databases) and language technology tools (software) available to everyone working with digital language materials. CELR coordinates and organises the documentation and archiving of the resources, develops language technology standards, and draws up the necessary legal contracts and licences for different types of users (public, academic, commercial, etc.). In addition to collecting language resources, a system will be launched for introducing the resources to potential users and for informing and educating them. The main users of CELR are researchers from Estonian R&D institutions and Social Sciences and Humanities researchers all over the world via the CLARIN ERIC network of similar centers in Europe. Access to data is provided through different sites: Public Repository https://entu.keeleressursid.ee/public-document , Language resources https://keeleressursid.ee/en/resources/corpora, and MetaShare CELR https://metashare.ut.ee/
clarin:el is the Greek national network of language resources. The CLARIN EL infrastructure is a nation-wide Research Infrastructure devoted to the sustainable storage, sharing, dissemination and preservation of language resources (LRs), and aims at increasing access to and augmentation of such resources at a national scale and beyond. It is an open, integrated, secure and interoperable storage, sharing and processing infrastructure for LRs (datasets, tools and processing services) for all domains and disciplines where language plays a critical role. CLARIN EL is implemented in the framework of CLARIN Attiki, a national project in support of ESFRI/2006 Research Infrastructures.
PubData is Leuphana's institutional research data repository for the long-term preservation, documentation and publication of research data from scientific projects. PubData is maintained by Leuphana's Media and Information Centre (MIZ) and is free of charge. The service is primarily aimed at Leuphana employees and additionally at researchers from cooperation partners contractually associated with Leuphana.
LINDAT/CLARIN is designed as a Czech “node” of CLARIN ERIC (Common Language Resources and Technology Infrastructure). It also supports the goals of the META-NET language technology network. Both networks aim at collection, annotation, development and free sharing of language data and basic technologies between institutions and individuals both in science and in all types of research. The CLARIN ERIC infrastructural project is more focused on humanities, while META-NET aims at the development of language technologies and applications. The data stored in the repository are already being used in scientific publications in the Czech Republic. In 2019 LINDAT/CLARIAH-CZ was established as a unification of two research infrastructures, LINDAT/CLARIN and DARIAH-CZ.
The ILC-CNR for CLARIN-IT repository is a library for linguistic data and tools. It covers the following areas: Text Processing and Computational Philology; Natural Language Processing and Knowledge Extraction; Resources, Standards and Infrastructures; Computational Models of Language Usage. The studies carried out within each area are highly interdisciplinary and involve different professional skills and expertise that extend across the disciplines of Linguistics, Computational Linguistics, Computer Science and Bio-Engineering.
PsychArchives is a disciplinary repository for psychological science and neighboring disciplines. Accommodating 20 different digital research object (DRO) types, including articles, preprints, research data, code, supplements, preregistrations, tests and multimedia objects, PsychArchives provides a digital space that integrates all research-related content relevant to psychology. PsychArchives is committed to the FAIR principles, facilitating the findability, accessibility, interoperability and reusability of research and research data.
This portal application brings together the data collected and published via OGC Web services from the individual observatories and provides the public with access to the data. It therefore serves as a database node that provides scientists and decision makers with reliable and easily accessible data and data products.
B2FIND is a discovery service based on metadata steadily harvested from research data collections from EUDAT data centres and other repositories. The service offers faceted browsing and it allows in particular to discover data that is stored through the B2SAFE and B2SHARE services. The B2FIND service includes metadata that is harvested from many different community repositories.
Europeana is the trusted source of cultural heritage brought to you by the Europeana Foundation and a large number of European cultural institutions, projects and partners. It’s a real piece of team work. Ideas and inspiration can be found within the millions of items on Europeana. These objects include:
  • Images - paintings, drawings, maps, photos and pictures of museum objects
  • Texts - books, newspapers, letters, diaries and archival papers
  • Sounds - music and spoken word from cylinders, tapes, discs and radio broadcasts
  • Videos - films, newsreels and TV broadcasts
All texts are CC BY-SA, images and media licensed individually.
The Social Science Data Archive (SSDA) is still active and maintained as part of the UCLA Library Data Science Center. SSDA Dataverse is one of the archiving options offered by SSDA; alternatively, data can be archived by SSDA itself, by ICPSR, by the UCLA Library or by the California Digital Library. The Social Science Data Archive serves the UCLA campus as an archive of faculty and graduate student survey research. We provide long-term storage of data files and documentation. We ensure that the data are usable in the future by migrating files to new operating systems. We follow government standards and archival best practices. The mission of the Social Science Data Archive has been and continues to be to provide a foundation for social science research with faculty support throughout an entire research project involving original data collection or the reuse of publicly available studies. Data Archive staff and researchers work as partners throughout all stages of the research process, beginning when a hypothesis or area of study is being developed, during grant and funding activities, while data collection and/or analysis is ongoing, and finally in long-term preservation of research results. Our role is to provide a collaborative environment where the focus is on understanding the nature and scope of the research approach and on managing research output throughout the entire life cycle of the project. Instructional support, especially support that links research with instruction, is also a mainstay of operations.
The gift of the Stowell Datasets, a digital archive of psychographic data, to the College of Liberal Arts (and the continued gift of new datasets) provides a unique opportunity for WSU to facilitate access to a valuable research resource. The datasets include over 350 individual major media market surveys (CATI, Random Digit Dialing telephone surveys) collected over the period 1989-2001 and feature approximately n=1,000+ respondents for each market for each year.
The Bavarian Archive for Speech Signals (BAS) is a public institution hosted by the University of Munich. This institution was founded with the aim of making corpora of current spoken German available to both the basic research and the speech technology communities via a maximally comprehensive digital speech-signal database. The speech material will be structured in a manner allowing flexible and precise access, with acoustic-phonetic and linguistic-phonetic evaluation forming an integral part of it.
NAKALA is a repository dedicated to SSH research data in France. Given its generalist and multi-disciplinary nature, all types of data are accepted, although certain formats are recommended to ensure long-term data preservation. It has been developed and is hosted by Huma-Num, the French national research infrastructure for digital humanities.
The CLARINO Bergen Center repository is the repository of CLARINO, the Norwegian infrastructure project. Its goal is to implement the Norwegian part of CLARIN. The ultimate aim is to make existing and future language resources easily accessible for researchers and to bring eScience to humanities disciplines. The repository includes INESS, the Norwegian Infrastructure for the Exploration of Syntax and Semantics. This infrastructure provides access to treebanks, which are databases of syntactically and semantically annotated sentences.
Polish CLARIN node – CLARIN-PL Language Technology Centre – is being built at Wrocław University of Technology. The LTC is addressed to scholars in the humanities and social sciences. Registered users are granted free access to digital language resources and advanced tools to explore them. They can also archive and share their own language data (in written, spoken, video or multimodal form).
The COordinated Molecular Probe Line Extinction Thermal Emission Survey of Star Forming Regions (COMPLETE) provides a range of data complementary to the Spitzer Legacy Program "From Molecular Cores to Planet Forming Disks" (c2d) for the Perseus, Ophiuchus and Serpens regions. In combination with the Spitzer observations, COMPLETE will allow for detailed analysis and understanding of the physics of star formation on scales from 500 A.U. to 10 pc.
The aim of the CfA Library Datasets Dataverse is to create a better information system to respond to the changing needs of astronomers, not only at the CfA but worldwide as well. As part of its growing partnership with the ADS, the CfA Library is expanding its metadata and data curation services and, in the process, creating datasets that the astronomy community may find useful. The CfA Library Datasets Dataverse has been created to share these datasets with the greater community in the hope that some members may find them useful. Please remember to acknowledge the CfA Library and the ADS and cite the work using the "Data Citation" presented under each study's "Cataloging Information" section.
The World Wide Molecular Matrix (WWMM) is an electronic repository for unpublished chemical data. WWMM is an open collection of information on small molecules. The "Matrix" in WWMM is influenced by William Gibson's vision of a cyberinfrastructure where all knowledge is accessible. The WWMM is an experiment to see how far this can be taken for chemical compounds. Although much of the information for a given compound has been Openly published, very little is available in Open electronic collections. The WWMM is aimed at catalysing this approach for chemistry, and the current collection is made available under the Budapest Open Access Initiative (http://www.budapestopenaccessinitiative.org/read).
The project is set up in order to improve the infrastructure for text-based linguistic research and development by building a huge, automatically annotated German text corpus and the corresponding tools for corpus annotation and exploitation. DeReKo:
  • constitutes the largest linguistically motivated collection of contemporary German texts
  • contains fictional, scientific and newspaper texts, as well as several other text types
  • contains only licensed texts
  • is encoded with rich meta-textual information
  • is fully annotated morphosyntactically (three concurrent annotations)
  • is continually expanded, with a focus on size and stratification of data
  • may be analyzed free of charge via the query system COSMAS II
  • serves as a 'primordial sample' from which users may draw specialized sub-samples (so-called 'virtual corpora') to represent the language domain they wish to investigate
Access to data of Das Deutsche Referenzkorpus is also provided by: IDS Repository https://www.re3data.org/repository/r3d100010382
B2SAFE is a robust, safe and highly available service which allows community and departmental repositories to implement data management policies on their research data across multiple administrative domains in a trustworthy manner. A solution to:
  • provide an abstraction layer which virtualizes large-scale data resources
  • guard against data loss in long-term archiving and preservation
  • optimize access for users from different regions
  • bring data closer to powerful computers for compute-intensive analysis
The Astronomy data repository at Harvard is currently open to all scientific data from astronomical institutions worldwide. It incorporates the Astroinformatics of Galaxies and Quasars Dataverse. The Astronomy Dataverse is connected to the indexing services provided by the SAO/NASA Astrophysics Data System (ADS).