Open Refine for UNLV Libraries: Glossary

Key Points

Introduction
  • OpenRefine is a powerful, free and open source tool that can be used for data cleaning.

  • OpenRefine will automatically track any steps you take in working with your data.

Working with OpenRefine
  • Removing leading and trailing whitespace from data can make for easier searching and sorting.

  • Parsing data using regular expressions, which can be simple or complex, can remove unwanted text quickly.
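
As a rough illustration of the two points above, the following Python sketch trims surrounding whitespace and then uses a regular expression to drop a trailing parenthesised note. The sample values and the `clean_cell` helper are hypothetical, made up for this example. Inside OpenRefine the same ideas are normally expressed with GREL, for instance `value.trim()`.

```python
import re

def clean_cell(value: str) -> str:
    """Trim surrounding whitespace, then strip a trailing parenthesised note."""
    trimmed = value.strip()
    # Remove a trailing " (...)" note, e.g. "Reno (unverified)" -> "Reno"
    return re.sub(r"\s*\([^)]*\)$", "", trimmed)

# Hypothetical sample cells with stray whitespace and unwanted notes
cells = ["  Las Vegas ", "Reno (unverified)", " Henderson (see note)  "]
cleaned = [clean_cell(c) for c in cells]
print(cleaned)
```

Trimming first matters here: the pattern is anchored to the end of the string, so trailing spaces would otherwise stop it from matching.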

Faceting and filtering
  • You can use facets and filters to explore your data.

  • You can use facets and filters to work with a subset of data in OpenRefine.

  • You can easily correct common data issues from a facet.
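
Conceptually, a text facet is a count of the distinct values in a column. The Python sketch below imitates that idea under simplifying assumptions: the rows, the `format` column and the `text_facet` helper are all hypothetical, and OpenRefine itself does this work through its interface rather than through code like this.

```python
from collections import Counter

# Hypothetical rows; "Photo" and "photo" are the same category spelled two ways
rows = [
    {"title": "Map A", "format": "map"},
    {"title": "Photo B", "format": "Photo"},
    {"title": "Photo C", "format": "photo"},
    {"title": "Map D", "format": "map"},
]

def text_facet(rows, column):
    """Count how often each distinct value appears in a column."""
    return Counter(row[column] for row in rows)

# Explore: the facet reveals the inconsistent spelling
facet = text_facet(rows, "format")

# Subset: selecting one facet value restricts the rows being worked on
subset = [row for row in rows if row["format"] == "photo"]

# Correct: editing a facet value rewrites every row that holds it
for row in rows:
    if row["format"] == "Photo":
        row["format"] = "photo"

corrected = text_facet(rows, "format")
print(facet, len(subset), corrected)
```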

Scripts from OpenRefine
  • OpenRefine tracks all changes, and this record can be reused as a script for future analyses or to reproduce an analysis.
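
The idea of replaying tracked steps can be sketched in Python as follows. This is a simplified stand-in, not OpenRefine's own format: OpenRefine records its steps as JSON with its own operation names, whereas the `history` list, the `OPERATIONS` table and the `apply_history` helper here are hypothetical and exist only to show the principle.

```python
# Hypothetical operation names mapped to simple string transforms
OPERATIONS = {
    "trim": str.strip,
    "lowercase": str.lower,
}

def apply_history(rows, history):
    """Return new rows with each recorded step applied in order."""
    result = [dict(row) for row in rows]  # copy so the input is left unchanged
    for step in history:
        transform = OPERATIONS[step["op"]]
        for row in result:
            row[step["column"]] = transform(row[step["column"]])
    return result

# Steps recorded while cleaning one dataset (hypothetical format)
history = [
    {"op": "trim", "column": "city"},
    {"op": "lowercase", "column": "city"},
]

# The same steps replayed on a new batch of data
new_batch = [{"city": "  Reno "}, {"city": "LAS VEGAS"}]
replayed = apply_history(new_batch, history)
print(replayed)
```

Because the steps are stored as data rather than performed by hand, the same cleaning can be repeated on another dataset or checked by someone else.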

Exporting and Saving Data from OpenRefine
  • Cleaned data or entire projects can be exported from OpenRefine.

  • Projects can be shared with collaborators, enabling them to see, reproduce and check all data cleaning steps you performed.

Other Resources in OpenRefine
  • Other examples and resources online are useful for learning more about OpenRefine.

Glossary