Data incubator

#Data incubator software#

Summer 2015 Incubator Program: Data Science for Social Good

The goal of the Data Science Incubator is to enable new science by bringing together data scientists and domain scientists to work on focused, intensive, collaborative projects. Projects frequently, but not exclusively, involve a non-trivial software engineering component. Our team of data scientists can provide expertise in state-of-the-art technology and methods in large-scale data manipulation and analytics (e.g., Hadoop, GraphLab, Myria, SciDB), cloud and cluster computing, statistics and machine learning, and visualization to help researchers extract knowledge from large, complex, and noisy datasets.

To apply to the program, any faculty, research staff, or student (typically, but not exclusively, at UW) can submit a short project proposal (details below) describing the science goals, the relevant datasets, and the expected technical challenges.

Each project must include a project lead who is willing to physically co-locate with the incubator staff. We find that collaboration in a shared space is important for deeper technical engagement and provides opportunities for "cross-pollination" among multiple concurrent projects. For Winter 2016, the incubator will operate on Tuesdays and Thursdays, and the project lead should plan to be available for several hours on these days. The program will operate out of the WRF Data Science Studio.

Incubator projects are not "for-hire" software jobs: each project will be led by representatives of the applicant's team working in collaboration with the data scientists and the broader eScience community.

Each project will be different, but we emphasize projects in the following categories:

Scalable Analytics: As data sizes continue to explode, parallel methods have become critical at every step. Scripts in Python and R are not natively parallel and are difficult to apply to datasets larger than main memory. Our team can help triage your problem and adapt it for use with parallel data manipulation and machine learning platforms such as Hadoop/MapReduce, parallel SQL databases, GraphLab, SciDB, and advanced research systems such as UW's own Myria. We also design and implement new parallel algorithms for large datasets independent of existing platforms.
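As a rough illustration of the Scalable Analytics point that ordinary Python scripts are hard to apply to datasets larger than main memory, the sketch below computes a mean by reading values in chunks, summarizing each chunk independently, and combining the summaries afterwards. This map-then-reduce split is the general idea behind platforms such as Hadoop/MapReduce, but the code is only a minimal sketch and not an incubator tool: the names `read_chunks`, `partial_stats`, `chunked_mean`, and the `map_fn` hook are hypothetical, and the generated input stands in for a large file.

```python
from itertools import islice


def read_chunks(values, chunk_size):
    """Lazily yield lists of at most chunk_size items from any iterable."""
    iterator = iter(values)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def partial_stats(chunk):
    """Map step: summarize one chunk as a small (count, total) pair."""
    numbers = [float(v) for v in chunk]
    return len(numbers), sum(numbers)


def chunked_mean(values, chunk_size=1000, map_fn=map):
    """Reduce step: combine per-chunk summaries into a global mean.

    map_fn is a hypothetical hook. The built-in map runs serially; any
    parallel map with the same (function, iterable) signature could be
    substituted without changing the rest of the code.
    """
    count, total = 0, 0.0
    for n, s in map_fn(partial_stats, read_chunks(values, chunk_size)):
        count += n
        total += s
    if count == 0:
        raise ValueError("no values to average")
    return total / count


if __name__ == "__main__":
    # Stand-in for a file too large to load: a generator of text lines.
    lines = (str(i) for i in range(1, 101))
    print(chunked_mean(lines, chunk_size=10))  # prints 50.5
```

Because each chunk is reduced to a small `(count, total)` pair, the per-chunk work is independent of every other chunk. One way to parallelize it, assuming the standard library's `multiprocessing` module, would be to pass a `Pool`'s `imap_unordered` method as `map_fn`. Note that a pool may read some chunks ahead of its workers, so this does not by itself guarantee a strict memory bound.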