Filter
Reset all

Subjects

Content Types

Countries

AID systems

API

Certificates

Data access

Data access restrictions

Database access

Database access restrictions

Database licenses

Data licenses

Data upload

Data upload restrictions

Enhanced publication

Institution responsibility type

Institution type

Keywords

Metadata standards

PID systems

Provider types

Quality management

Repository languages

Software

Syndications

Repository types

Versioning

  • * at the end of a keyword allows wildcard searches
  • " quotes can be used for searching phrases
  • + represents an AND search (default)
  • | represents an OR search
  • - represents a NOT operation
  • ( and ) implies priority
  • ~N after a word specifies the desired edit distance (fuzziness)
  • ~N after a phrase specifies the desired slop amount
Found 803 result(s)
Country
The Australian National Corpus collates and provides access to assorted examples of Australian English text, transcriptions, audio and audio-visual materials. Text analysis tools are embedded in the interface allowing analysis and downloads in *.CSV format.
The Text Laboratory provides assistance with databases, word lists, corpora and tailored solutions for language technology. We also work on research and development projects alone or in cooperation with others - locally, nationally and internationally. Services and tools: Word and frequency lists, Written corpora, Speech corpora, Multilingual corpora, Databases, Glossa Search Tool, The Oslo-Bergen Tagger, GREI grammar games, Audio files: dialects from Norway and America etc., Nordic Atlas of Language Structures (NALS) Journal, Norwegian in America, NEALT, Ethiopian Language Technology, Access to Corpora
CLARIN-UK is a consortium of centres of expertise involved in research and resource creation involving digital language data and tools. The consortium includes the national library, and academic departments and university centres in linguistics, languages, literature and computer science.
Country
The KPDL covers cultural heritage, scientific and regional collections – digital copies of different forms of publications: books, journals, graphics, articles, leaflets, posters, playbills, photographs, invitations, maps, exhibition catalogues and trade fairs of the region. The Kujawsko-Pomorska Digital Library is to serve scientists, students, schoolchildren and all the citizens of the region.
Country
National Institute of Information and Communications Technology (NICT) has taken charge of the WDC for Ionosphere. WDC for Ionosphere archives ionospheric data and metadata from approximately 250 stations across the globe.
The IMLS conducts annual surveys of public and state libraries in the US that have response rates near 100%. Data is compiled for states, library systems, and individual library branches and includes statistics for circulation, visits, staff, expenditures, and more. Data is available in two formats: MS Access and flat file, plain text. Data for museums is now included.
The Tree of Life Web Project is a collection of information about biodiversity compiled collaboratively by hundreds of expert and amateur contributors. Its goal is to contain a page with pictures, text, and other information for every species and for each group of organisms, living or extinct. Connections between Tree of Life web pages follow phylogenetic branching patterns between groups of organisms, so visitors can browse the hierarchy of life and learn about phylogeny and evolution as well as the characteristics of individual groups.
Jason is a remote-controlled deep-diving vessel that gives shipboard scientists immediate, real-time access to the sea floor. Instead of making short, expensive dives in a submarine, scientists can stay on deck and guide Jason as deep as 6,500 meters (4 miles) to explore for days on end. Jason is a type of remotely operated vehicle (ROV), a free-swimming vessel connected by a long fiberoptic tether to its research ship. The 10-km (6 mile) tether delivers power and instructions to Jason and fetches data from it.
Country
The World Data Centre section provides software and data catalogue information and data produced by IPS Radio and Space Services over the past few past decades. You can download data files, plot graphs from data files, check data availability, retrieve data sets and station information.
The focus of PolMine is on texts published by public institutions in Germany. Corpora of parliamentary protocols are at the heart of the project: Parliamentary proceedings are available for long stretches of time, cover a broad set of public policies and are in the public domain, making them a valuable text resource for political science. The project develops repositories of textual data in a sustainable fashion to suit the research needs of political science. Concerning data, the focus is on converting text issued by public institutions into a sustainable digital format (TEI/XML).
A web database is provided which can be used to calculate photon cross sections for scattering, photoelectric absorption and pair production, as well as total attenuation coefficients, for any element, compound or mixture (Z ≤ 100), at energies from 1 keV to 100 GeV.
The Linguistic Data Consortium (LDC) is an open consortium of universities, libraries, corporations and government research laboratories. It was formed in 1992 to address the critical data shortage then facing language technology research and development. Initially, LDC's primary role was as a repository and distribution point for language resources. Since that time, and with the help of its members, LDC has grown into an organization that creates and distributes a wide array of language resources. LDC also supports sponsored research programs and language-based technology evaluations by providing resources and contributing organizational expertise. LDC is hosted by the University of Pennsylvania and is a center within the University’s School of Arts and Sciences.
SWE-CLARIN is a national node in European Language and Technology Infrastructure (CLARIN) - an ESFRI initiative to build an infrastructure for e-science in the humanities and social sciences. SWE-CLARIN makes language-based materials available as research data using advanced processing tools and other resources. One basic idea is that the increasing amount of text and speech - contemporary and historical - as digital research material enables new forms of e-science and new ways to tackle old research issues.
Codex Sinaiticus is one of the most important books in the world. Handwritten well over 1600 years ago, the manuscript contains the Christian Bible in Greek, including the oldest complete copy of the New Testament. The Codex Sinaiticus Project is an international collaboration to reunite the entire manuscript in digital form and make it accessible to a global audience for the first time. Drawing on the expertise of leading scholars, conservators and curators, the Project gives everyone the opportunity to connect directly with this famous manuscript.
Country
>>>!!!<<<As stated 2017-05-23 Cancer GEnome Mine is no longer available >>>!!!<<< Cancer GEnome Mine is a public database for storing clinical information about tumor samples and microarray data, with emphasis on array comparative genomic hybridization (aCGH) and data mining of gene copy number changes.
Country
<<<<<!!! With the implementation of GlyTouCan (https://glytoucan.org/) the mission of GlycomeDB comes to an end. !!!>>>>> With the new database, GlycomeDB, it is possible to get an overview of all carbohydrate structures in the different databases and to crosslink common structures in the different databases. Scientists are now able to search for a particular structure in the meta database and get information about the occurrence of this structure in the five carbohydrate structure databases.
The Medical Expenditure Panel Survey (MEPS) is a set of large-scale surveys of families and individuals, their medical providers, and employers across the United States. MEPS is the most complete source of data on the cost and use of health care and health insurance coverage.
The USDA Agricultural Marketing Service (AMS) Cotton Program maintains a National Database (NDB) in Memphis, Tennessee for owner access to cotton classification data. The NDB is computerized telecommunications system which allows owners or authorized agents of owners to retrieve classing data from the current crop and/or the previous four crops. The NDB stores classing information from all 10 regional classing offices.
The Alvin Frame-Grabber system provides the NDSF community on-line access to Alvin's video imagery co-registered with vehicle navigation and attitude data for shipboard analysis, planning deep submergence research cruises, and synoptic review of data post-cruise. The system is built upon the methodology and technology developed for the JasonII Virtual Control Van and a prototype system that was deployed on 13 Alvin dives in the East Pacific Rise and the Galapagos (AT7-12, AT7-13). The deployed prototype system was extremely valuable in facilitating real-time dive planning, review, and shipboard analysis.
the Data Hub is a community-run catalogue of useful sets of data on the Internet. You can collect links here to data from around the web for yourself and others to use, or search for data that others have collected. Depending on the type of data (and its conditions of use), the Data Hub may also be able to store a copy of the data or host it in a database, and provide some basic visualisation tools.
4DGenome is a public database that archives and disseminates chromatin interaction data. Currently, 4DGenome contains over 8,038,247 interactions curated from both experimental studies (high throughput and individual studies) and computational predictions. It covers five organisms, Homo sapiens, Mus musculus, Drosophila melanogaster, Plasmodium falciparum, and Saccharomyces cerevisiae.
The Government is releasing public data to help people understand how government works and how policies are made. Some of this data is already available, but data.gov.uk brings it together in one searchable website. Making this data easily available means it will be easier for people to make decisions and suggestions about government policies based on detailed information.
ScholarSphere is an institutional repository managed by Penn State University Libraries. Anyone with a Penn State Access ID can deposit materials relating to the University’s teaching, learning, and research mission to ScholarSphere. All types of scholarly materials, including publications, instructional materials, creative works, and research data are accepted. ScholarSphere supports Penn State’s commitment to open access and open science. Researchers at Penn State can use ScholarSphere to satisfy open access and data availability requirements from funding agencies and publishers.