ERDDAP's "files" system lets you browse a virtual file system and download source
data files. Hopefully, this is a familiar, easy system that you can use with your
favorite web browser or, if you prefer, from a command line program like
curl.
ERDDAP was designed around the idea that most datasets are huge, so most users
just need or want a subset of the dataset that they are interested in
(e.g., a smaller geographic area, a smaller time range, or not all of the data variables).
But we
understand that some users actually do want an entire dataset, or at least the subset
which is found in a subset of the source data files. If that's you, then the "files"
system may be for you. One advantage of the "files" system is that you can see each
file's size (in bytes) and Last Modified time (Zulu time zone),
so it is easy to see if a file has been changed.
To use the "files" system, just click. On any "files" web page, you can:
- Click on a heading (Name, Last modified, Size, or Description) to sort the items
by that attribute. Clicking repeatedly on one heading toggles the sort order
(ascending or descending). Note that "Last modified" uses the Zulu time zone.
- Click on a directory name to go to that directory.
- Click on a filename to download that file.
By default, a directory listing is returned as an HTML table on a web page.
A user can request that a directory listing be returned in a different
file format by appending any of these file extensions:
.csv, .htmlTable, .itx, .json, .jsonlCSV1, .jsonlCSV, .jsonlKVP,
.mat, .nc, .nccsv, .tsv, or .xhtml. For example, instead of this web page:
https://coastwatch.pfeg.noaa.gov/erddap/files/jplMURSST41/
you can request the directory listing as a .csv file:
https://coastwatch.pfeg.noaa.gov/erddap/files/jplMURSST41/.csv
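As a rough sketch of how a program might use such a listing, the Python below builds the listing URL for a chosen file extension and parses a .csv listing into rows. The function names are hypothetical, the column names (Name, Last modified, Size, Description) are assumed to match the headings on the web page, and the sample text is invented for illustration, so check a real listing before relying on any of it.

```python
import csv
import io

def listing_url(directory_url, extension=".csv"):
    """Append a listing file extension to a "files" directory URL."""
    if not directory_url.endswith("/"):
        directory_url += "/"
    return directory_url + extension

def parse_listing(csv_text):
    """Parse a .csv directory listing into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))

# Invented sample text (assumption): a real listing would be downloaded
# from listing_url(...) and may format its values differently.
SAMPLE = (
    "Name,Last modified,Size,Description\n"
    "example_a.nc,2020-01-01T00:00:00Z,12345,\n"
    "example_b.nc,2020-01-02T00:00:00Z,67890,\n"
)
```

For example, listing_url("https://coastwatch.pfeg.noaa.gov/erddap/files/jplMURSST41/") gives the .csv URL shown above, and parse_listing(SAMPLE) returns one dictionary per listed item, which makes it easy to compare sizes or Last modified values between two listings.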
For datasets available via ERDDAP's tabledap or griddap, ERDDAP administrators
can set up ERDDAP to change a dataset's metadata and variable names on-the-fly
so that you, the user, see an improved version of the dataset's metadata. But in
"files", you will see the original metadata and variable names, so don't be surprised
if they are different! If you aren't comfortable dealing with the different metadata
and variable names, you might prefer using the dataset's Data Access Form instead.
Similarly, when you request a subset of data from one of ERDDAP's Data Access Forms,
you can specify the file type (e.g., .nc, .csv, .json, .mat) that you want to receive in
response. Naturally, the source data files available via "files" are just available in
one file type. If you aren't happy with the source file's file type, you might prefer
using the dataset's Data Access Forms instead.
Some datasets in this ERDDAP aren't available via the "files" system. Common reasons
include:
- The dataset's data doesn't come from files (e.g., it comes from a database
or Cassandra, or from a remote THREDDS, HYRAX, or GRADS data server).
- The immediate source files are .ncml files which specify how to modify the
actual data files on-the-fly.
- The ERDDAP administrator chose not to make the source data files available.
If the source files for a dataset that you want aren't available, you can email
the administrator of this ERDDAP,
dfo dot meds-sdmm dot mpo at dfo-mpo dot gc dot ca,
to request that they be made available, but there is usually a reason why they aren't
already available.
We understand that some users might prefer that ERDDAP offer files via FTP instead
of HTTP as is done by "files". Sorry. Hopefully, you'll be able to do what you need
to do with the current "files" system.
ERDDAP doesn't offer results stored in compressed (e.g., .zip or .gzip) files.
Instead, ERDDAP looks for
accept-encoding
in the HTTP GET request header sent
by the client. If a supported compression type (gzip, x-gzip, or deflate) is found
in the accept-encoding list, ERDDAP includes "content-encoding" in the HTTP response
header and compresses the data as it transmits it.
It is up to the client program to look for
content-encoding and decompress the data accordingly.
Requesting compression is optional, but compressed responses are often 3-10 times faster,
so this is a big time savings if you are downloading lots of large files.
(Note that there is no benefit
to requesting compressed .png files since the files' contents are already compressed.)
- By default, browsers and OPeNDAP clients always request compressed data and
decompress the returned data.
- With curl, add --compressed to the command line to tell curl to
request a compressed response and automatically decompress it.
- With other client software, you have to explicitly set this up.
Here is a
Java example.
Here is a
Python example
(although you should either handle deflate'd responses or not request deflate).
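The following is a minimal sketch of the client side of this exchange, using only Python's standard library. It requests gzip only (so it never has to handle deflate'd responses) and decompresses the body when the response header says it was compressed. The function names are hypothetical, and this is not the example linked above.

```python
import gzip
import urllib.request

def build_request(url):
    """Create a GET request that asks for gzip only (not deflate)."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})

def decode_body(body, content_encoding):
    """Decompress body if content-encoding says it is gzip-compressed."""
    encoding = (content_encoding or "").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding in ("", "identity"):
        return body  # the server sent the data uncompressed
    raise ValueError("unexpected content-encoding: " + encoding)

def fetch(url):
    """Download url and return the (decompressed) bytes."""
    with urllib.request.urlopen(build_request(url)) as response:
        encoding = response.headers.get("Content-Encoding")
        return decode_body(response.read(), encoding)
```

Calling fetch() with a "files" URL would return the file's bytes whether or not the server chose to compress the response.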
If you want to download a series of files from ERDDAP, you don't have to request each
file's ERDDAP URL in your browser, sitting and waiting for each file to download.
Instead, you have two options:
- If you are comfortable writing computer programs (e.g., with C, Java, Python, Matlab, R),
you can write a program with a loop that imports all of the desired data files.
- If you are comfortable with command line programs (just running a program,
or using bash or tcsh scripts in Linux or Mac OS X, or batch files in Windows),
you can use curl to
save results files from ERDDAP into files on your hard drive, without using a browser
or writing a computer program.
ERDDAP+curl is amazingly powerful and allows you to
use ERDDAP in many new ways. To install curl:
- On Linux and Mac OS X, curl is probably already installed as /usr/bin/curl.
- On Windows, or if your computer doesn't have curl already, you need to
download curl
and install it. To get to a command line in Windows, click on the Windows icon and type
cmd into the search text field.
("Win32 - Generic, Win32, binary (without SSL)" worked for me in Windows 7.)
Please be kind to other ERDDAP users: run just one script or curl command at a time.
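For readers who would rather write a program than use curl, here is a minimal Python sketch of such a loop. It downloads files strictly one after another, in keeping with the request above. The function names and the pause between requests are assumptions made for illustration; the URL pattern is the one used in the curl examples in this document.

```python
import os
import time
import urllib.request

BASE_URL = "https://coastwatch.pfeg.noaa.gov/erddap/files/cwwcNDBCMet/nrt/"

def plan_downloads(stations, out_dir):
    """Return (url, local_path) pairs for the given station IDs."""
    return [
        (BASE_URL + "NDBC_" + station + "_met.nc",
         os.path.join(out_dir, station + ".nc"))
        for station in stations
    ]

def download_all(stations, out_dir, pause_seconds=1.0):
    """Download the files one at a time (never in parallel)."""
    os.makedirs(out_dir, exist_ok=True)
    for url, path in plan_downloads(stations, out_dir):
        urllib.request.urlretrieve(url, path)  # saves the file at path
        time.sleep(pause_seconds)  # hypothetical courtesy pause
```

Calling download_all(["41008", "41009", "41010"], "ndbc") would save the three files as ndbc/41008.nc, ndbc/41009.nc, and ndbc/41010.nc.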
Instructions for using curl are on the
curl man page and in this
curl tutorial.
But here is a quick tutorial related to using curl with ERDDAP:
- To download and save one file, use
curl --compressed -g "erddapUrl" -o fileDir/fileName.ext
where --compressed tells curl to request a compressed response and automatically decompress it,
-g disables curl's globbing feature,
erddapUrl is any ERDDAP URL that requests a data or image file, and
-o fileDir/fileName.ext specifies the name for the file that will be created.
For example,
curl --compressed -g "https://coastwatch.pfeg.noaa.gov/erddap/files/cwwcNDBCMet/nrt/NDBC_41008_met.nc" -o ndbc/41008.nc
In curl, as in many other programs, the query part of the erddapUrl must be
percent encoded:
all characters in parameter values (the parts after '=' signs) other than A-Za-z0-9_-!.~'()*
must be encoded as %HH, where HH is the 2 digit hexadecimal value of the character,
for example,
a space becomes %20. Characters above #127 must be converted to UTF-8 bytes, then each UTF-8
byte must be percent encoded (ask a programmer for help). There are
websites that percent encode and decode for you.
If you get the URL from your browser's address text field, this may be already done.
- To download and save many files in one step, use curl with the globbing feature
enabled:
curl --compressed "erddapUrl" -o fileDir/fileName#1.ext
Since the globbing feature treats the characters [, ], {, and } as special, you must also
percent encode
them in the erddapUrl as %5B, %5D, %7B, %7D, respectively.
Fortunately, these are almost never in "files" filenames.
Then, in the erddapUrl, replace a zero-padded number (for example 01) with a
range of values (for example, [01-15]), or replace a substring (for example 41008) with a list of values (for example, {41008,41009,41010}).
The #1 within the output fileName causes the current value of the range or
list to be put into the output fileName. For example,
curl --compressed "https://coastwatch.pfeg.noaa.gov/erddap/files/cwwcNDBCMet/nrt/NDBC_{41008,41009,41010}_met.nc" -o ndbc/#1.nc
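The percent encoding rule described in the first item above can be sketched in Python with the standard library's quote() function. The helper name encode_value is hypothetical. The sketch assumes Python 3.7 or later, where quote() leaves letters, digits, and the characters _ . - ~ unencoded by default, so only ! ' ( ) * need to be added to the safe list.

```python
from urllib.parse import quote

# Characters that the rule above leaves unencoded, beyond the ones
# quote() already keeps (A-Z a-z 0-9 _ . - ~ in Python 3.7+).
EXTRA_SAFE = "!'()*"

def encode_value(value):
    """Percent encode one parameter value (the part after an '=' sign).

    Characters above #127 are converted to UTF-8 bytes first, and each
    byte is then written as %HH.
    """
    return quote(value, safe=EXTRA_SAFE, encoding="utf-8")
```

Because [, ], {, and } are not in the safe list, the same helper also produces the %5B, %5D, %7B, and %7D forms needed when curl's globbing feature is enabled.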
For most common image and video file types, the "files" system will now display a '?' icon
to the left of the filename. If you hover over that, you will see a popup window showing
the image or an audio or video player.
Similarly, for a few audio file types (notably .mp3, .ogg, and .wav), you will see an
audio control which allows you to listen to the audio file.
These preview features will only work for certain file types, in certain browsers,
in certain operating systems.
They rely on browser features, so they are largely out of our control.
Alternatively, if you click on the link for an image, audio, or video file, a viewer
or player will open in a separate window. (If your browser asks you what you want
to do with the file, tell it to handle the media file itself (not via other software),
and tell it to
remember this choice so that it will be used automatically in the future.)
One of the main features of ERDDAP is that it allows you to download subsets
of a dataset (via the dataset's Data Access Form) in whatever file format you want
or make customizable graphs and maps via the dataset's Make A Graph web page
(so you don't have to download data files or install any graphing software).
These allow you to avoid having to work with the original source data files
in file formats that you aren't familiar with and/or don't want to work with.
If you instead choose to download and work with the original source files
offered by ERDDAP's "files" system, you have to figure out how you want
to work with the files.
Fortunately, there are lots of software tools for working with the various file types:
- All File Types
For all types of files, you can look up the file extension at sites like
FileInfo.com,
which give a basic description of the file type and list software that can
be used to work with the files (view, read, write, edit, etc.).
Or, you can use your favorite search engine to search for what you want.
- Audio Files
(for example, .3gp, .aiff, .au, .flac, .mp3, .ogg, .pcm, .wav, .wma)
The easiest way to hear most audio files offered by the "files" system is with
ERDDAP's system to view media files
in your browser,
because you don't have to download the files or install any software.
See the
Wikipedia List of Audio File Formats.
If you want to do other things with these files,
there are numerous programs available to play and edit audio files.
- HDF Files (.hdf)
HDF .hdf
is a common type of binary data file.
There are a few software packages that can work with .hdf files, including:
- NASA's Panoply
is free, commonly used software to make graphs and maps from .hdf files.
- The HDF5 library
is the official library from the HDF Group to read and write all .hdf files.
- Some analysis programs like
Matlab, and
R language
can read .hdf files via an add-on library.
- Many tools for .nc files
will also work with .hdf files.
- Image Files
(for example, .gif, .jpeg, .png, .tiff, .webp)
The easiest way to view most image files offered by the "files" system is with
ERDDAP's system to view media files
in your browser,
because you don't have to download the files or install any software.
See the
Wikipedia List of Image File Formats.
If you want to do other things with image files,
there are numerous image viewing and editing programs including:
- Paint comes with Windows (look in the Windows Start list under Windows Accessories).
- A different Paint comes with Mac OS X (look in the Applications Folder).
- Gimp,
an open source program for all operating systems.
- IrfanView,
a free image editor for Windows.
- Various software from Adobe including
Photoshop.
- See the
Wikipedia Comparison Of Raster Graphics Editors.
- And many, many more. Use your favorite search engine to search for what you want.
- NetCDF Files (.nc)
NetCDF .nc
is a common type of binary data file.
There are two subcategories of .nc files: version 3 files (still widely used) and version 4 files
(which are actually .hdf files with a few changes).
Files of both versions have the extension .nc and can be read by programs that read .nc files.
There are a large number of software packages that can work with .nc files, including:
- Free, commonly used programs to make graphs and maps include
NASA's Panoply
and
Ncview
(which can also be installed
via Conda).
- NCO,
a powerful command line tool to permanently modify .nc files.
- NetCDF-C and NetCDF-Java,
the main software libraries for C, C++, Fortran, or Java to read and write .nc files.
- Many analysis programs like
Ferret,
Matlab, and
R language
can read .nc files (perhaps via an add-on library), make graphs and maps, and
work with the data in .nc files.
- Numerous other software programs can read (and thus work with) .nc files. See
this list.
- Text Files (for example, .csv, .tsv, .txt)
Text files
are different than word processing files, which have special embedded
formatting commands. If you import a text file into a word processor and make changes
to the file, be sure to then save the file as an ASCII text file once again.
Or, avoid this problem by using a text editor program.
If you edit .tsv (tab separated value) files, be very careful to maintain the
tabs which separate the values in different "columns" on each row.
By default (even in many text editors), a tab often appears as a single space (or a few spaces).
So be sure to use the editor's feature that makes tabs visible (as a special symbol)
so that you can maintain the tabs between values.
.xml files are technically text files, but there are advantages to
using separate XML editors to work with them.
There are dozens of text editor programs for every operating system, including:
- Notepad comes with Windows (look in the Windows Start list under Windows Accessories).
- TextEdit comes with Mac OS X (look in the Applications folder).
- Most Linux variants come with a few text editors, one of which is the default
(which you can change).
If you are using Linux, you probably already have a favorite.
- See the
Wikipedia Comparison Of Text Editors.
- And many, many more. Use your favorite search engine to search for what you want.
- Video Files
(for example, .avi, .flv, .mov, .mp4, .ogg, .ogv, .webm, .wmv)
The easiest way to view most video files offered by the "files" system is with
ERDDAP's system to view media files
in your browser,
because you don't have to download the files or install any software.
See the
Wikipedia List of Video File Formats.
If you want to do other things with video files,
there are numerous video playing and editing programs including:
- Windows Media Player comes with Windows (look in the Windows Start list).
- Windows Video Editor comes with Windows (look in the Windows Start list).
- QuickTime Player (a viewer) comes with Mac OS X (look in the Applications Folder).
- iMovie (an editor) comes with Mac OS X (look in the Applications Folder).
- Different Linux distributions come with different video players.
There are many open source and commercial video players and editors available for Linux.
Use your favorite search engine to search for what you want.
- See the
Wikipedia Comparison Of Video Player Software.
- See the
Wikipedia Comparison Of Video Editing Software.
- And many, many more. Use your favorite search engine to search for what you want.
- XML Files (e.g., .xml)
XML files
are structured text files. You can view them in your browser or in a text editor,
but there is also specialized software for working with XML files.
See this
Wikipedia Comparison Of XML Editors.
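For the .tsv files discussed under Text Files above, one way to avoid damaging the tabs is to make the change with a small program instead of an editor. The Python sketch below replaces a value in one named column and writes the rows back with tab separators. The function name, column names, and sample values are invented for illustration.

```python
import csv
import io

def update_tsv(tsv_text, column, old, new):
    """Replace old with new in one named column, keeping tab separators."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    index = rows[0].index(column)  # the first row holds the column names
    for row in rows[1:]:
        if row[index] == old:
            row[index] = new
    out = io.StringIO()
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
    return out.getvalue()
```

For example, update_tsv(text, "temp", "NaN", "-999") changes only the matching values in the temp column and leaves every tab in place.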
Unlike requests for most of the other resources in ERDDAP, a request for a file from the
"files" system (other than .nc and .hdf files)
may include a "Range" request in the header which specifies a range of bytes to
be returned, instead of the whole file.
See
Byte_serving.
This is used by some client software (for example, audio and video players in web browsers)
to request chunks of the file instead of the whole file.
Accessing a remote file via byte ranges is often slow and inefficient.
Sometimes it's worth it for reading small samples of remote files, notably audio and video files.
But the more times you need access to the file, the more efficient it is to
just download the file and then work with the local file.
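As an illustration of what such a request looks like, the Python sketch below builds a GET request with a "Range" header for an inclusive range of bytes. The function names and the example URL are hypothetical, and the check for .nc and .hdf simply mirrors the exception stated above rather than anything ERDDAP publishes as an API.

```python
import urllib.request

# Per the text above, range requests apply to files other than these.
BLOCKED_EXTENSIONS = (".nc", ".hdf")

def range_allowed(url):
    """Return True if a byte range request makes sense for this file."""
    return not url.lower().endswith(BLOCKED_EXTENSIONS)

def build_range_request(url, first_byte, last_byte):
    """Create a GET request for bytes first_byte..last_byte (inclusive)."""
    if not range_allowed(url):
        raise ValueError("download the whole .nc or .hdf file instead")
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("invalid byte range")
    header = "bytes=%d-%d" % (first_byte, last_byte)
    return urllib.request.Request(url, headers={"Range": header})
```

Passing the returned request to urllib.request.urlopen() would ask for just that chunk of the file. A server that honors the request normally answers with HTTP status 206 (Partial Content).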
ERDDAP's "files" system refuses all byte range requests to .nc and .hdf files,
so don't even try to use NetCDF-Java, NetCDF-C, Ncview, Ferret, or other software tools to connect to
.nc or .hdf files served by ERDDAP's "files" system as if they were local files.
These requests are blocked because this approach is horribly inefficient
and often causes other problems.
Instead:
- Use (OPeN)DAP client software to connect to ERDDAP's DAP services for the dataset
(which have /griddap/ or /tabledap/ in the URL). That's what DAP is for.
- Or, use the dataset's Data Access Form to request a subset of data.
- Or, if you need the entire file or repeated access over a long period of time,
use curl, wget, or your browser to download the entire file from the "files" system,
then access the data from your local copy of the file.