1030 Danube fish occurrence database

DOI Info:

  • DOI: 10.18728/igb-fred-1030.3
  • How to cite: Yusdiel Torres-Cambas, András Ambrus, Miklós Bán, Bálint Bánó, Anthony Basooma, Vanessa Bremerich, Florian Borgwardt, Maša Čarf, Irina Cernisencu, Gorčin Cvijanović, István Czeglédi, Sami Domisch, Tibor Erős, Zoltán Fehér, Vivien Füstös, Juergen Geist, Thomas Hein, Milica Jaćimović, Sonja C. Jähnig, Béla Kiss, Maroš Kubala, Klaudija Lebar, Borislava Kostadinova Margaritova, Matej Marusic, Paul Meulenbroek, Stoyan Dobrev Mihov, Attila Mozsár, Zoltán Müller, Christoffer Nagel, Iulian Nichersu, Dušan Nikolić, Sandi Orlic, Joachim Pander, Polona Pengal, Marina Piria, László Polyák, Bálint Preiszner, Simon Rusjan, Márton Sallai, Zoltán Sallai, Péter Sály, Andrea Samu, Brigitte Sasano, Astrid Schmidt-Kloiber, András Sevcsik, Marija Smederevac-Lalić, András Specziár, Twan Stoffers, Zoltán Szalóky, Renáta Szita, Gábor Takács, Péter Takács, Maxim Teichert, Milcho Todorov, Balázs Tóth, Theodora Trichkova, Damir Valić, Zoltán Vitál, Martin Tschikof (2025) Danube fish occurrence database. IGB Leibniz-Institute of Freshwater Ecology and Inland Fisheries. dataset. https://doi.org/10.18728/igb-fred-1030.3
  • Previous DOI version: 10.18728/igb-fred-1029.2
  • Successor DOI version: 10.18728/igb-fred-1035.4

    DOI history

    Date        DOI                       PackageId  Note
    2025-10-27  10.18728/igb-fred-1027.0  1027
    2025-10-27  10.18728/igb-fred-1028.1  1028
    2025-10-28  10.18728/igb-fred-1029.2  1029
    2025-10-28  10.18728/igb-fred-1030.3  1030       this package
    2025-11-10  10.18728/igb-fred-1035.4  1035
    2025-11-10  10.18728/igb-fred-1036.5  1036
    2025-11-12  10.18728/igb-fred-1037.6  1037       latest
Title
Danube fish occurrence database
Sampling interval
Irregular Interval
Description

The present database compiles and standardizes fish occurrence datasets from federal agencies, research institutes, and conservation organizations, integrating data from sources such as the Global Biodiversity Information Facility (GBIF), the Joint Danube Surveys, and the European Fish Index. It contains 133,095 occurrence records, organized into 35 columns, covering 114 fish species from 29 families and 16 orders, with a temporal range from 1856 to 2024. In total, 506,290 entries were collected and subsequently subjected to quality checks and cleaning procedures. An R package, danubeoccurR, was developed to streamline data collation, formatting, and quality control. Additional R packages, hydrographr and specleanr, were incorporated into the workflow to aid in data manipulation and geospatial analysis.

A visualization of the spatial distribution of records is available at https://geo.igb-berlin.de/layers/geonode:danube4all_fish_occurrence_records.

Taxonomic Validation: The taxonomic names of species were verified against FishBase to ensure compliance with the most up-to-date fish taxonomy.
    
Spatial Distribution Validation: The spatial distribution of species occurrences was assessed to ensure that the recorded locations were geographically plausible and consistent with known habitats for each species. To achieve this, the dataset was first compared against environmental maps to identify and flag potential environmental outliers, and subsequently cross-checked by the data provider.
    
Temporal Validation: The temporal distribution of occurrences was examined to check for inconsistencies or improbable records.
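
The description does not say how the temporal check was carried out. As a rough illustration only, the Python sketch below flags records whose year falls outside the stated 1856 to 2024 range or whose date cannot be read. The column name eventDate is a Darwin Core term assumed here, not confirmed for this dataset, and the function name is hypothetical; consult metadata.pdf for the actual column names.

```python
import re

def flag_temporal_outliers(rows, first_year=1856, last_year=2024,
                           date_field="eventDate"):
    """Return indices of rows whose year is missing, unreadable, or out of range.

    `date_field` is an assumed Darwin Core column name, not taken from the dataset.
    """
    flagged = []
    for i, row in enumerate(rows):
        # Read a leading four-digit year, e.g. "1999-06-01" or "1856".
        match = re.match(r"\s*(\d{4})", row.get(date_field) or "")
        if match is None or not (first_year <= int(match.group(1)) <= last_year):
            flagged.append(i)
    return flagged
```

Flagged rows are only candidates for review; whether a date is truly improbable depends on the source of the record.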
    
Data Completeness and Consistency: The dataset was examined for missing values, duplicates, and inconsistencies in key fields such as coordinates, dates, and species names. Gaps in data were identified, and missing or inconsistent records were flagged for review and potential correction.
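
A comparable check can be repeated by users on their own copy of the data. The following Python sketch, which is not the authors' procedure, flags rows with empty key fields and exact duplicates on those fields. The column names are Darwin Core terms assumed for illustration, and the three sample rows are invented toy records, not taken from occurrence_records.csv.

```python
import csv
import io

# Assumed Darwin Core column names; check metadata.pdf for the real ones.
KEY_FIELDS = ["scientificName", "decimalLatitude", "decimalLongitude", "eventDate"]

def flag_incomplete_and_duplicates(rows, key_fields=KEY_FIELDS):
    """Return (indices of rows with empty key fields, indices of repeated rows)."""
    missing, duplicates = [], []
    seen = set()
    for i, row in enumerate(rows):
        values = tuple((row.get(field) or "").strip() for field in key_fields)
        if any(value == "" for value in values):
            missing.append(i)
            continue
        if values in seen:
            duplicates.append(i)  # same key fields as an earlier row
        else:
            seen.add(values)
    return missing, duplicates

# Invented toy records, used only to show the call.
SAMPLE = """scientificName,decimalLatitude,decimalLongitude,eventDate
Barbus barbus,48.2,16.4,2001-05-03
Barbus barbus,48.2,16.4,2001-05-03
Squalius cephalus,,19.0,1998-07-11
"""
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
missing, duplicates = flag_incomplete_and_duplicates(rows)
```

For the real file, the rows would come from csv.DictReader opened on occurrence_records.csv instead of the inline sample.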

Data Access and Format: The dataset is available in a standard tabular format (CSV) using Darwin Core-compliant terminology to ensure compatibility with biodiversity databases. Users should refer to the metadata file for a detailed description of the column names. For convenience, a custom function named split_and_save_csv() is provided in danubeoccurR to split the occurrence dataset into independent datasets.

Geospatial Considerations: Species occurrences are georeferenced based on available locality information. Users should be aware that some records may have variable spatial precision, particularly historical occurrences. It is recommended to apply spatial filtering techniques suited to the intended analysis. For example, the coordinate uncertainty provided for records sourced from GBIF can help determine whether a record is suitable for a given analysis. Additionally, the function snap_points_on_map() in danubeoccurR allows users to manually adjust occurrence points for greater precision.
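
One simple form of spatial filtering is a threshold on coordinate uncertainty. The Python sketch below assumes the uncertainty is stored in a column named coordinateUncertaintyInMeters (the Darwin Core term; whether this dataset uses that exact name is an assumption) and that empty or non-numeric values such as NA mean the uncertainty is unknown. The function name and the threshold are hypothetical and should be chosen to suit the analysis.

```python
def filter_by_uncertainty(rows, max_uncertainty_m, keep_unknown=False,
                          field="coordinateUncertaintyInMeters"):
    """Keep rows whose coordinate uncertainty is at most `max_uncertainty_m`.

    Rows with an empty or non-numeric value are kept only if `keep_unknown`
    is True. `field` is an assumed Darwin Core column name.
    """
    kept = []
    for row in rows:
        raw = (row.get(field) or "").strip()
        try:
            value = float(raw)
        except ValueError:
            value = None  # empty string, "NA", or other non-numeric text
        if value is None:
            if keep_unknown:
                kept.append(row)
        elif value <= max_uncertainty_m:
            kept.append(row)
    return kept
```

Dropping records of unknown uncertainty is the conservative default here; since uncertainty is mentioned only for records sourced from GBIF, setting keep_unknown=True may be needed to retain records from the other sources.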

Taxonomic Standardization: Users are advised to cross-check species names with updated taxonomic databases if taxonomic revisions occur after the dataset's publication.

Data Quality and Potential Limitations: While efforts were made to standardize and clean the data, users should consider potential sources of bias, including sampling effort variations, taxonomic misidentifications, or incomplete historical records. Some records have been flagged as environmental outliers based on inconsistencies between species occurrence and expected environmental conditions. These flagged records should be reviewed carefully and may require further investigation or validation before inclusion in analyses.
 

Authors: Yusdiel Torres-Cambas, András Ambrus, Miklós Bán, Bálint Bánó, Anthony Basooma, Vanessa Bremerich, Florian Borgwardt, Maša Čarf, Irina Cernisencu, Gorčin Cvijanović, István Czeglédi, Sami Domisch, Tibor Erős, Zoltán Fehér, Vivien Füstös, Juergen Geist, Thomas Hein, Milica Jaćimović, Sonja C. Jähnig, Béla Kiss, Maroš Kubala, Klaudija Lebar, Borislava Kostadinova Margaritova, Matej Marusic, Paul Meulenbroek, Stoyan Dobrev Mihov, Attila Mozsár, Zoltán Müller, Christoffer Nagel, Iulian Nichersu, Dušan Nikolić, Sandi Orlic, Joachim Pander, Polona Pengal, Marina Piria, László Polyák, Bálint Preiszner, Simon Rusjan, Márton Sallai, Zoltán Sallai, Péter Sály, Andrea Samu, Brigitte Sasano, Astrid Schmidt-Kloiber, András Sevcsik, Marija Smederevac-Lalić, András Specziár, Twan Stoffers, Zoltán Szalóky, Renáta Szita, Gábor Takács, Péter Takács, Maxim Teichert, Milcho Todorov, Balázs Tóth, Theodora Trichkova, Damir Valić, Zoltán Vitál, Martin Tschikof

Species Groups
Study site
Danube
GeoNode references

GeoNode layers

  • Danube4all Fish Occurrence Records
Contact
Yusdiel Torres-Cambas
Licence for data
All rights reserved. Please send a request to Yusdiel Torres-Cambas if you would like to use this data. Mind our data policy: IGB Data Policy
Project
DANUBE4all Project Website

Data files (e.g. excel)

Title                   Created              File type
metadata.pdf            28. Oct. 2025 17:45  datatable: .pdf
occurrence_records.csv  27. Oct. 2025 15:56  datatable: .csv

Machine Readable Metadata Files

FRED provides all metadata of this package in a machine-readable format: a plain XML file and an EML file in Ecological Metadata Language. Both files are published under the free CC BY 4.0 Licence.

  • Danube_fish_occurrence_database.xml
  • Danube_fish_occurrence_database.eml

