Coding
Coding, or statistical coding, is another process that a data scientist uses to prepare data for analysis. In this process, both quantitative values (such as income or years of education) and qualitative values (such as race or gender) are categorized, or coded, in a consistent way.
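As a minimal sketch of what consistent coding might look like in practice, the following Python snippet maps a qualitative variable to integer codes and bins a quantitative variable into brackets. The codebook, the income brackets, the missing-value code, and the function names are all hypothetical choices made for illustration; a real project would take them from its own codebook.

```python
# Hypothetical codebook: category labels mapped to integer codes.
GENDER_CODES = {"female": 1, "male": 2, "other": 3}
MISSING = -9  # assumed code for missing or unrecognized values

# Hypothetical income brackets as (exclusive upper bound, code).
INCOME_BRACKETS = [(25_000, 1), (50_000, 2), (100_000, 3)]
TOP_INCOME_CODE = 4  # code for incomes at or above the last bound


def code_gender(value):
    """Return the integer code for a label, ignoring case and whitespace."""
    if value is None:
        return MISSING
    return GENDER_CODES.get(value.strip().lower(), MISSING)


def code_income(income):
    """Return the bracket code for a numeric income."""
    if income is None:
        return MISSING
    for upper, code in INCOME_BRACKETS:
        if income < upper:
            return code
    return TOP_INCOME_CODE


records = [
    {"gender": "Female", "income": 42_000},
    {"gender": " male ", "income": 130_000},
    {"gender": None, "income": 18_500},
]

# Apply the same rules to every record so the coding stays consistent.
coded = [
    {"gender": code_gender(r["gender"]), "income": code_income(r["income"])}
    for r in records
]
print(coded)
```

Keeping the rules in one place (here, two small lookup structures) is one way to make sure that "Female", " male ", and a missing value are always treated the same way, regardless of who enters the data.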
A data scientist codes data for several reasons, such as the following:
- Effectiveness: statistical models run more effectively on coded data
- Machine readability: computers can process the coded variables
- Accountability: the data scientist can run models blind, that is, without knowing what the variables stand for, to reduce programmer or author bias
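The accountability point can be illustrated with a short sketch in which variable names are replaced by neutral labels before modeling, and the key is kept separately. The function name `blind_columns` and the `v1`, `v2` labeling scheme are assumptions made for this example, not a standard procedure.

```python
def blind_columns(rows, prefix="v"):
    """Rename columns to neutral labels; return the blinded rows and the key."""
    # Sort the names so the labels are assigned in a reproducible order.
    names = sorted({name for row in rows for name in row})
    key = {name: f"{prefix}{i}" for i, name in enumerate(names, start=1)}
    blinded = [
        {key[name]: value for name, value in row.items()} for row in rows
    ]
    return blinded, key


rows = [{"gender": 1, "income": 2}, {"gender": 2, "income": 4}]
blinded, key = blind_columns(rows)

print(blinded)  # the analyst models these columns without knowing their meaning
print(key)      # stored separately and consulted only after the models are run
```

The analyst would work only with the blinded rows, and the key would be used afterward to translate results back into meaningful variable names.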
You can imagine the process of coding as the means to transform data into a form required for a system or application.