National Science Foundation: Big Data Regional Innovation Hubs: Establishing Spokes to Advance Big Data Applications (BD Spokes)
Deadline: September 18, 2017
The National Network of Big Data Regional Innovation Hubs (BD Hubs) program was initiated in 2015. Four BD Hubs – Midwest, Northeast, South, and West – were established to foster multi-sector collaborations among academia, industry, and government, both nationally and internationally. The BD Hubs are serving a convening and coordinating role by bringing together a wide range of Big Data stakeholders in order to connect solution seekers with solution providers.
The Big Data Regional Innovation Hubs: Establishing Spokes to Advance Big Data Applications (BD Spokes) program extends the BD Hubs network by establishing multi-institutional and multi-sector collaborations to focus on topics of specific interest to a given region. Collaborating with BD Hubs, each BD Spoke will focus on a particular topic that requires Big Data approaches and solutions. The set of activities managed by a BD Spoke will promote progress towards solutions in the chosen topic area. The regional BD Hub Steering Committee will provide general guidance to each BD Spoke and will assist the BD Spoke in coordinating with the national BD Hub network, with other BD Spokes, and with the broader innovation ecosystem.
In 2018, the BD Spokes program will support Big Data activities in a specific topic area of interest to a corresponding regional BD Hub. The activities of a BD Spoke should address one or more of the following Big Data Innovation themes:
- Accelerating progress towards societal grand challenges relevant to regional and national priority areas. Due to the pervasiveness of Big Data in virtually all national priority areas, the BD Spokes have the opportunity to bring rapid change in application areas by facilitating the creation of interdisciplinary and multidisciplinary data-intensive teams.
- Helping to automate the Big Data lifecycle. Steps in the data lifecycle include: ingestion, validation, curation, quality assessment, anonymization, publication, active data management, and analysis (including information extraction, visualization, and annotation). Automated (or semi-automated) techniques are needed in order to keep up with the rapid data rates, large volumes, and immense heterogeneity of Big Data. Automation may also aid the reproducibility of data processing and analysis workflows.
- Enabling access to and increasing the use of important available data assets, including international data sets, where relevant. Many valuable data sets are underutilized, and results from the analysis of such data are not shared, due to a variety of actual or perceived costs. One role for a BD Spoke is to act as a catalyst for organizing and sharing data sets and related data services. BD Spokes are expected to play an important role in supporting and promulgating open data and open source software policies within their projects to facilitate the sharing of data and outcomes of analyses.
The results from an individual BD Spoke’s activities must also contribute to the education and training missions of the corresponding BD Hub, which is a key component of the BD Hub’s activities. For example, efforts could include educating researchers on Big Data tools and techniques used and/or processes followed by the BD Spoke; engaging with or soliciting input from the public (where relevant) in order to facilitate broader impact of the work undertaken by the BD Spoke; and developing education and training modules for widely disseminating Big Data to people at all educational levels from K-12 students to lifelong learners.
The West Big Data Innovation Hub (http://westbigdatahub.org/) includes New Mexico and is interested in the following thematic focus areas:
- Metro data science: Tackling challenges in transportation, housing, economic development, and more in a wide variety of landscapes
- Precision medicine: Working towards data-informed decisions that are predictive, preventative, personalized, and participatory
- Managing natural resources and hazards: Addressing opportunities in topics ranging from water resource management and agriculture to emergency preparedness, energy/climate, and sustainability
- Big data technology: Leveraging a track record of innovations in areas including storage, cloud computing, analytics, and visualization
- Data-enabled scientific discovery and learning: Engaging stakeholders across disciplines to strengthen data science education and transform scientific discovery
Proposed BD Spoke projects are expected to focus on their articulated regional challenges and opportunities. In particular, submissions addressing the following areas of emphasis are welcomed:
- Education: Support innovations in software infrastructure and the use of education and learning data sets arising from both administrative data and information collected from interactive learning systems to improve learning outcomes. Projects could also propose to develop education and/or workforce development and training programs that both broaden participation in Big Data research and development activities and enable a workforce for the 21st century. Workforce and training activities will be evaluated on their innovativeness and their ability to be replicated in new environments.
- Data intensive research in the social, behavioral, and economic sciences: Accelerate research infrastructure and frameworks that integrate and operate on data from multiple sources including administrative data; scientific instruments from large-scale surveys, brain research, large-scale simulations, etc.; digitally-authored media, including text, images, audio, and emails; and streaming data from weblogs, videos, and financial/commercial transactions.
- Data-driven research in chemistry: Encourage innovative partnerships that capitalize on the data revolution (https://www.nsf.gov/pubs/2017/nsf17036/nsf17036.jsp) and utilize discovery-based science to verify scientific predictions and insights in chemistry. This area of emphasis looks for formation of new alliances to accelerate the discovery of new chemical species with predicted properties and/or new chemical reactions using approaches such as large-scale data analysis, data architectures, or machine learning. Proposal topics must be in alignment with the core research programs within the Division of Chemistry (CHE; https://www.nsf.gov/div/index.jsp?div=che).
- Neuroscience: Engage questions and opportunities in neuroscience that leverage BD Hub resources, such as enabling large scale, integrative modeling, sharing of diverse data and resources, and other neuroscience and neurotechnology approaches that require very large-scale, complex, or diverse data. Connections to other NSF programs on neuroscience research (https://www.nsf.gov/news/special_reports/brain/) are welcome.
- Data analytics for security: Better analytics and detection of security- and privacy-related patterns, anomalies, trends, and changes in BD Spoke applications and/or regional data exchanges. Development of statistical, computational and/or interdisciplinary methods for improving BD Spoke security/privacy/trustworthiness through the management, exploration, analytics, mining, and visualization of structured or unstructured BD Spoke data from disparate sources.
- Replicability and reproducibility in data science: Facilitate robust and reliable science by improving the replicability and reproducibility of research instruments, procedures, codes, and results.
The areas emphasized above in no way preclude submission of proposals concerning other topics. Any topic approved by a coordinating BD Hub is welcome. Proposals must articulate a clear focus within a specific Big Data topic or application area, while highlighting its Big Data innovation theme. All BD Spokes must have clearly defined mission statements with goals and corresponding metrics of success. Some examples that illustrate the specificity and level of detail for missions are:
- Use a specific set of analytical tools to improve the lead time for predictions of certain critical regional indicators by a given percentage
- Given a specific set of high value data sets that were previously siloed and, therefore, usable only within a single research group or institution, make them available to a broader set of groups, or to the public at large, along with appropriate privacy and access control mechanisms
- Adapt specific Big Data technologies to automate previously tedious and manual data collection and curation processes for specific types of data in a given field of science
- For a specific genre of data, introduce new types of (automated) analytics—which were previously tedious to perform and manual in nature—that can be performed with minimal human intervention
BD Spokes are intended to convene stakeholders to augment and spawn new research efforts as opposed to directly carrying out traditional research. Potential activities for BD Spokes include, but are not limited to:
- Accelerating the creation and development of Big Data solutions relevant to its mission by convening stakeholders across sectors (e.g., academic, industry, non-profits, government, etc.) to partner in results-driven programs and projects
- Driving successful pilot programs by acting as a matchmaker among the various stakeholders
- Engaging stakeholders across the region—including solution providers and end users—to enable dialogue, share best practices, and/or set standards for data access, data formats, metadata, etc.
- Connecting critical data resources to diverse stakeholders that can best utilize them to fulfill the BD Spoke mission.
Amount: A total of $10,000,000 is anticipated. Award size is dependent upon the award type, as follows:
- SMALL Spokes: Awards with total budgets from $100,000 to $500,000 over a period of up to 3 years. These awards are intended for collaborative projects, involving multiple institutions, for establishing BD Spokes on specific topics/themes related to Big Data innovation. SMALL proposals should be consistent with the regional and national priorities identified by the BD Hubs.
- MEDIUM Spokes: Awards with total budgets (including indirect costs) from $500,001 to $1,000,000 over a period of up to 3 years. MEDIUM proposals must deliver tangible outcomes, for example: (1) explicit results from data-enabled or data-facilitated inquiry in a scientific or engineering field or other domain area; (2) a prototype or proof of concept for a technology platform, data product, data standards, or other data infrastructure; or (3) an education or workforce development program with a plan for deployment and sustainment beyond the three-year award period.
Eligibility: Proposals may be submitted by the following:
- Universities and colleges: Universities and two- and four-year colleges (including community colleges) accredited in, and having a campus located in, the U.S. acting on behalf of their faculty members
- Nonprofit, non-academic organizations: Independent museums, observatories, research labs, professional societies and similar organizations in the U.S. associated with educational or research activities
- State and local governments: State educational offices or organizations and local school districts
Applicants should have past successful experiences engaging in Big Data Innovation activities. Proposals must identify the need to be formally connected to a regional BD Hub and the reasons why accomplishing the proposed activities will not be feasible outside the BD Hubs ecosystem. It is expected that the Principal Investigators (PIs) of a BD Spoke proposal will have engaged in serious and in-depth discussions with their corresponding BD Hub PIs and Steering Committee prior to submitting their proposal. BD Spoke proposers must seek formal approval from the regional BD Hub Steering Committee in the form of a letter of collaboration.