4. Data Analysis and Collaboration Using HydroShare Workflows

The following sections provide hands-on examples of discovering groups and resources of interest, joining a collaborative environment, and analyzing data whose results can be written back to HydroShare and shared with others in the group.

More specifically, users will learn the following:

Before starting this activity, please make sure you are logged into HydroShare.

Discovering/Finding a HydroShare Group

HydroShare provides a simple search function to help users discover groups on the site. To begin, use the Collaborate tab at the top of the HydroShare page.

The “Find Groups” page lists the discoverable and public groups available. Using the search function, users can enter keywords that match group names, purposes, descriptions, and other keywords indexed by the group owner.

Click the title of any listed group to view the group's landing page. The members of a public group are visible to users who are searching, while the members of a discoverable group cannot be seen. Either way, users can request group membership.

Joining a Group

Once a group of interest is found, users can request access by clicking the “Ask to join” button. If the owner of the group has not set the “auto accept” option for new requests, users may experience a wait time until their request is processed. Otherwise, access may be granted almost immediately.

About Groups

A group's landing page provides information about the group, such as its title, purpose, and description. From the landing page, users can see all resources shared with the group as well as all members of the group. Note that resources within the group are not owned by all members: although they are shared, ownership is retained by the individual user who created the resource.

Exercise: Using Resources and Workflows Within a Group to Collaborate On Data Analysis

For this exercise, attendees who have joined the HIDSI group will use an existing shared resource containing both data and a Jupyter Notebook workflow to perform some visualization and data analysis. The purpose of this exercise is to:

HydroShare supports various web apps in which workflows and models can be built. Typically, most hydrological modeling and data analysis is done on a personal computer or on a centralized computing system. Gaps in a user’s knowledge of such a centralized computing system can be a barrier to the user’s research. Some examples could be:

By supporting web apps such as JupyterHub and its Jupyter Notebook functionality, HydroShare enables a more flexible environment for model execution and data analysis. Users get a preconfigured environment and do not need to worry about cyberinfrastructure and dependencies, which lowers the aforementioned barriers to research.

A Jupyter notebook contains live code, equations, visualizations, and explanatory text. Notebooks can be used to implement scientific workflows and other computational tasks. HydroShare users can either create and share their own workflows using Jupyter notebooks or discover and use existing workflows implemented as Jupyter notebooks.

The code within a Jupyter notebook can execute operating-system and language-specific commands within the hosting environment. This provides users with access to the hosting environment’s computational capabilities while eliminating the need for users to install and configure software.

While this does not relieve the user from needing to learn the programming language, operating system commands, or commands associated with the program being used for analysis, it does remove the need for users to install and have the capacity to run these programs locally.
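
As a minimal sketch of what running language-specific and operating-system commands from a notebook cell can look like, the Python example below queries the interpreter directly and then launches a separate process in the hosting environment. The helper name run_command is hypothetical and is not part of HydroShare or Jupyter; it only wraps Python's standard subprocess module.

```python
import platform
import subprocess
import sys


def run_command(args):
    """Run a command in the hosting environment and return its stdout as text."""
    completed = subprocess.run(args, capture_output=True, text=True, check=True)
    return completed.stdout.strip()


# Language-specific: ask the running Python interpreter for its version.
python_version = platform.python_version()

# Operating-system level: start a separate process on the host.
# sys.executable is used so the example does not depend on any particular shell.
host_output = run_command([sys.executable, "-c", "print('hello from the host')"])

print(python_version)
print(host_output)
```

Both calls run on the server that hosts the notebook, not on the user's own machine, which is why no local installation is needed.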

About the Following Exercise

For this exercise, attendees will perform visualization and data analysis on stock performance in the United States. They will use an existing workflow to perform common computational tasks on the data provided, using several trading indicators from technical analysis. Attendees will then perform further analysis on the data and write their results back to HydroShare as a new, separate resource. This resource can then be shared with a group for further scientific collaboration.

The data used in this workshop is daily stock data from Yahoo Finance. Yahoo Finance is an online platform that provides financial news and market data, including historical stock quotes and financial reports.

Workflow for Technical Analysis of U.S. Stock

The Jupyter Notebook extracts daily stock data from Yahoo Finance and illustrates the technical analysis of US stocks.

Preparation Step: Accessing the JupyterHub Notebook

Step 1: Install and Import Libraries

Step 2: Conduct Technical Analysis of Stock Prices

First, Choose Your Stock Ticker for Technical Analysis

AA (Alcoa) / AAPL (Apple) / AMZN (Amazon.com) / BA (Boeing Company) / BAC (Bank of America Corporation) / CAT (Caterpillar) / KO (Coca-Cola) / GE (General Electric Company) / IBM (International Business Machines Corporation) / INTC (Intel Corp) / JNJ (Johnson & Johnson) / MCD (McDonald’s Corp) / MSFT (Microsoft) / NKE (NIKE) / PFE (Pfizer) / T (AT&T) / TRV (Travelers Companies) / TSLA (Tesla) / VZ (Verizon Communications) / WMT (Walmart)

Second, Plot Relative Performance to Stock Index (S&P500 Index)
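
One common way to compare a stock with an index is to rebase both price series to the same starting value and take their ratio. The sketch below assumes that approach; the workshop notebook may compute relative performance differently. The function names and the price values are made up for illustration and are not real market data.

```python
def rebase(prices, base=100.0):
    """Rescale a price series so that its first value equals `base`."""
    first = prices[0]
    return [base * price / first for price in prices]


def relative_performance(stock, index):
    """Ratio of the rebased stock to the rebased index.

    A value above 1 means the stock has outperformed the index
    since the first observation; below 1 means it has lagged.
    """
    return [s / i for s, i in zip(rebase(stock), rebase(index))]


# Illustrative closing prices only (not real data).
stock_close = [50.0, 52.0, 55.0, 54.0]
index_close = [4000.0, 4040.0, 4080.0, 4120.0]

rel = relative_performance(stock_close, index_close)
print(rel)
```

Plotting rel over time against a horizontal line at 1 then shows at a glance when the stock led or trailed the index.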

Third, Plot Bollinger Bands

Bollinger Bands are two standard deviations above and below the 20-day moving average.
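
The definition above can be sketched in plain Python as follows. This is an illustrative implementation, not the notebook's own code: the function name is hypothetical, the closing prices are synthetic, and the use of the population standard deviation is an assumption (some libraries default to the sample standard deviation instead).

```python
from statistics import mean, pstdev


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return (middle, upper, lower) band lists for a list of closing prices.

    Entries are None until a full `window` of prices is available.
    """
    middle, upper, lower = [], [], []
    for i in range(len(closes)):
        if i + 1 < window:
            middle.append(None)
            upper.append(None)
            lower.append(None)
            continue
        window_values = closes[i + 1 - window : i + 1]
        avg = mean(window_values)       # 20-day moving average
        spread = pstdev(window_values)  # population standard deviation (assumption)
        middle.append(avg)
        upper.append(avg + num_std * spread)
        lower.append(avg - num_std * spread)
    return middle, upper, lower


# Synthetic closing prices cycling between 100 and 104 (not real data).
closes = [float(100 + (i % 5)) for i in range(25)]
middle, upper, lower = bollinger_bands(closes)
print(middle[19], upper[19], lower[19])
```

Prices that move outside the upper or lower band are unusually far from their recent average, which is what the plot in this step makes visible.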

Fourth, Plot Stochastic Oscillator

The Stochastic Oscillator is a momentum indicator that compares the current closing price with the recent price range.
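
A minimal sketch of the usual %K calculation is shown below: the close is expressed as a percentage of the distance between the lowest low and the highest high of a lookback window. The 14-day window is a conventional default assumed here, not a value taken from the notebook, and the function name and price values are hypothetical.

```python
def stochastic_k(highs, lows, closes, window=14):
    """Return the %K line: where the close sits within the recent high-low range.

    Values run from 0 (close at the window's lowest low) to 100 (close at the
    window's highest high). Entries are None until a full window is available
    or when the range is flat.
    """
    k_values = []
    for i in range(len(closes)):
        if i + 1 < window:
            k_values.append(None)
            continue
        highest_high = max(highs[i + 1 - window : i + 1])
        lowest_low = min(lows[i + 1 - window : i + 1])
        if highest_high == lowest_low:
            k_values.append(None)  # flat range: %K is undefined
        else:
            k_values.append(100.0 * (closes[i] - lowest_low) / (highest_high - lowest_low))
    return k_values


# Synthetic, steadily rising prices (not real data).
highs = [10.0 + i for i in range(15)]
lows = [8.0 + i for i in range(15)]
closes = [9.0 + i for i in range(15)]

k = stochastic_k(highs, lows, closes)
print(k[13])
```

Because the synthetic prices rise every day, the close stays near the top of its recent range and %K comes out high.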

Fifth, Interpret a Trading Signal from Stochastic Oscillator
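
A widely used rule of thumb treats oscillator readings above 80 as overbought and readings below 20 as oversold. The sketch below encodes that convention; the thresholds and the function name are assumptions for illustration, and the notebook may apply a different rule.

```python
def classify_stochastic(k_value, overbought=80.0, oversold=20.0):
    """Label a single Stochastic Oscillator reading using threshold levels."""
    if k_value is None:
        return "no signal"   # not enough data to compute the oscillator
    if k_value >= overbought:
        return "overbought"  # close is near the top of its recent range
    if k_value <= oversold:
        return "oversold"    # close is near the bottom of its recent range
    return "neutral"


# Illustrative readings only (not real data).
for reading in [93.3, 50.0, 12.0, None]:
    print(reading, classify_stochastic(reading))
```

An overbought or oversold label is a prompt for closer inspection rather than a trading instruction, since a strongly trending stock can stay in either zone for some time.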