What is the UGA Data Warehouse?

The UGA Data Warehouse is the centralized repository of institutional data that ensures users across the university are reporting from and utilizing a common and consistent version of data. The new UGA Data Warehouse provides an environment for reporting and decision-making that is supported by timely and accurate information from official, authenticated sources.   

Data are regularly extracted from source systems and stored in the data warehouse.  The raw data are then validated, and may be reformatted, summarized, restructured, and/or enriched with data from other sources.  The result is a data warehouse that can serve as a source for report generation, interactive dashboards, ad hoc reporting queries, and visualizations.
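The extract, validate, reformat, enrich, and summarize steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows, field names (`student_id`, `term`, `credit_hours`), the college lookup, and every function name are hypothetical and do not reflect the actual UGA source systems or warehouse schema.

```python
from collections import defaultdict

# Hypothetical rows as they might arrive from a source system (not real UGA data).
raw_rows = [
    {"student_id": "001", "term": "fall", "credit_hours": "12"},
    {"student_id": "002", "term": "Fall", "credit_hours": "15"},
    {"student_id": "", "term": "Fall", "credit_hours": "9"},  # missing ID, fails validation
]

# Hypothetical lookup drawn from a second source, used for enrichment.
college_by_student = {"001": "Arts and Sciences", "002": "Engineering"}


def validate(rows):
    """Keep only rows that have a student ID and numeric credit hours."""
    return [r for r in rows if r["student_id"] and r["credit_hours"].isdigit()]


def reformat(rows):
    """Normalize term capitalization and convert credit hours to integers."""
    return [
        {
            "student_id": r["student_id"],
            "term": r["term"].title(),
            "credit_hours": int(r["credit_hours"]),
        }
        for r in rows
    ]


def enrich(rows, lookup):
    """Attach a college name from another source; 'Unknown' if no match."""
    return [{**r, "college": lookup.get(r["student_id"], "Unknown")} for r in rows]


def summarize(rows):
    """Total credit hours per college."""
    totals = defaultdict(int)
    for r in rows:
        totals[r["college"]] += r["credit_hours"]
    return dict(totals)


warehouse_rows = enrich(reformat(validate(raw_rows)), college_by_student)
summary = summarize(warehouse_rows)
```

Here `warehouse_rows` stands in for the detailed, validated data that ad hoc queries would read, while `summary` stands in for the kind of aggregate a dashboard or report might display.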

UGA Data Warehouse structure overview

What type of data is in the warehouse?
UGA Data Warehouse Structure:

The UGA Data Warehouse, when fully implemented, will become the repository for all university data assets.  Source data systems will record transactional information, while a daily snapshot of each system's data will be loaded into the warehouse for operational use and end-user accessibility. This process will establish the data warehouse as the common and consistent source for all institutional data, providing end users with "a single version of the truth."

 

Archive versus Reportable:

In response to the ongoing Mainframe Decommission, scheduled for completion in 2019, legacy data will also be moved into the data warehouse. This data will generally fall into one of two categories: archive or reportable.

  • Archive data is data that has been moved from the legacy source system "as is" for the main purpose of safely retaining a final copy of source data as it existed prior to the mainframe decommissioning.  In most cases it is not in a suitable format for, nor intended to be used in, developing reports.
  • Reportable data is data that has been moved from the legacy source system with the purpose of being used in historical reporting.  Like "non-legacy" data, this type of data is typically validated and may be reformatted, summarized, restructured, and/or enriched with data from other sources for use in historical/trend reports, interactive dashboards, ad hoc reporting queries, and visualizations.
How does the data warehouse environment differ from FACTS and other OIR data resources?
    Historical/Official

    Historical data about the University of Georgia, such as the data available through FACTS, reflect past activities such as instruction during a recently completed academic term, research activity over the prior year, and financial transactions over the prior fiscal year. These data are not subject to continued change, are considered official, and have typically been reported to external agencies and other reporting outlets.

    Historical Data is typically:

    • A once-a-semester “snapshot” of data as of a point in time, according to a predetermined schedule
    • Maintained and reported by OIR and available through FACTS
    • Set up for required reporting needs to state and federal offices, using established definitions
    • Used whenever reporting to external agencies and whenever comparing apples to apples matters most, as in trend analyses

     

    Operational

    Current or operational data, such as the data available in the data warehouse, are constantly changing as UGA conducts its business across missions. For example, students enroll, add, and drop classes; faculty obtain grants, publish research results, conduct outreach activities, and report course grades; administrative staff handle financial transactions, space assignments, and personnel decisions. These data and reports are not considered official, and you may see fluctuations in numbers reported on a daily basis.

    Operational Data is typically:

    • A snapshot captured on a daily schedule
    • No more than 24 hours old
    • A copy of the transactional system
    • Used whenever current status is important, such as for advising and course scheduling
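The distinction between historical and operational data can be expressed as two simple checks. The sketch below is illustrative only: the 24-hour limit comes from the list above, but the function names, the purpose labels, and the returned source names are hypothetical conveniences, not part of any UGA system.

```python
from datetime import datetime, timedelta

# Taken from the text: operational data is "no more than 24 hours old".
MAX_OPERATIONAL_AGE = timedelta(hours=24)

# Hypothetical labels for uses that call for official, point-in-time data.
OFFICIAL_PURPOSES = {"external reporting", "trend analysis"}


def is_current(snapshot_time, now):
    """Return True if a daily snapshot still falls within the 24-hour window."""
    return now - snapshot_time <= MAX_OPERATIONAL_AGE


def choose_source(purpose):
    """Suggest FACTS for official uses, the data warehouse for current status."""
    return "FACTS" if purpose in OFFICIAL_PURPOSES else "data warehouse"


# A snapshot loaded at 8:00 is still current at 20:00 the same day,
# but one loaded at 8:00 the previous day (36 hours earlier) is not.
same_day = is_current(datetime(2024, 1, 2, 8), datetime(2024, 1, 2, 20))
previous_day = is_current(datetime(2024, 1, 1, 8), datetime(2024, 1, 2, 20))
```

For example, `choose_source("advising")` suggests the data warehouse, while `choose_source("trend analysis")` suggests FACTS, matching the guidance that official snapshots are preferred when consistent comparisons matter most.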
What other data will be available?

Over the course of the next few years, source systems across campus will have their data available in the data warehouse, making it more accessible and reportable across campus.

All data domains will be housed within the data warehouse, so end users can easily combine multiple data sources in their reporting.

To find out more about the data currently available in the data warehouse, see the timeline.