Project Proposal
Team Project Objectives
- Select a publicly available dataset from trustworthy primary or secondary data sources (see Stack Overflow Survey dataset in H3)
- Examples of trustworthy data sources are government sources (e.g., https://data.gov/), organizations specialized in polling and survey research, etc.
- Avoid sites such as Kaggle, where some datasets are of questionable quality
- Describe the selected dataset using the Aether Data Documentation Template (released August 2022), available at https://www.microsoft.com/en-us/research/project/datasheets-for-datasets/
- Consider what can be learned from the data and formulate questions that will guide your analysis
- Design the problem-solving steps that answer the questions
- Implement and test your design using Python, test-driven development, version control, and project management (as described in the guidelines)
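As an illustration of the test-driven style named in the last objective, here is a minimal sketch using Python's built-in `unittest` and `csv` modules. Everything in it is an assumption made for illustration: the `count_by_field` helper, the `Country` and `YearsCode` field names, and the three-row sample are hypothetical and stand in for whatever dataset and questions your team selects. The idea is that the tests are written first, from the question being asked, and the function is then written to make them pass.

```python
import csv
import io
import unittest
from collections import Counter

# Hypothetical in-memory sample used only for testing;
# a real project would read rows from the chosen dataset file.
SAMPLE_CSV = """Country,YearsCode
Canada,5
Canada,12
Germany,3
"""


def count_by_field(rows, field):
    """Count how many rows carry each value of `field`, skipping blank values."""
    return Counter(row[field] for row in rows if row.get(field))


class CountByFieldTest(unittest.TestCase):
    def setUp(self):
        # Parse the sample into a list of dicts keyed by column name.
        self.rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

    def test_counts_each_value(self):
        self.assertEqual(
            count_by_field(self.rows, "Country"),
            {"Canada": 2, "Germany": 1},
        )

    def test_skips_blank_values(self):
        rows = self.rows + [{"Country": "", "YearsCode": "1"}]
        counts = count_by_field(rows, "Country")
        self.assertEqual(counts["Canada"], 2)
        self.assertNotIn("", counts)
```

Saved in a test file, this can be run with `python -m unittest`. Committing the failing test before the implementation also leaves a visible record of the test-first workflow in version control.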
Proposal Artifact
The project proposal is written in Markdown in a file named PROPOSAL.md, located in the team's remote GitHub repository.
Organization of Project Proposal
Use the template provided below. It has headings and prompts for the proposal sections.
- Address clearly and precisely all the prompts in the template.
- Both team members contribute to writing the proposal.
- Keep the prompts in the proposal.
## {Project Proposal}
**Contributors**: {Team members' names}
**Date**: {Date proposal was created}
### Data Source
List the source of your dataset: name and link.
### Data Documentation
Both team members contribute to this section.
- Briefly describe the dataset the project will investigate. See the Aether Data Documentation Template.
- Focus on overview, data collection procedures, representativeness, data quality, privacy, and distribution and access
- Divide work equally and include authorship information for content in this section that each student wrote.
### Questions
Both team members contribute to this section.
- Write FOUR questions (two for each team member) that:
- Guide your project work
- Give interesting insights into the dataset.
- For each question:
- Indicate the dataset field(s) involved in your question.
- Describe how the data in the selected fields relate to each other.
- Describe what other characteristics of the data would provide additional insights into the question.
Proposal Writing and Submission
- Team members prepare the proposal document collaboratively.
- The document MUST clearly state who has written what content in the proposal.
- The final version of the document is in the docs directory of the remote repo.
- Team members are encouraged to practice feature-branch workflow to collaboratively work on the proposal.
- Version control is used to evidence team members’ collaboration.
- Each team member creates a PDF version of the project proposal and uploads it to Canvas.
Evaluation
- Although the proposal is not graded, failing to submit it on time will cost you 10% of the project points.
- The content of the proposal will be used in the project report.