
Outlier Detection (standard deviation vs. interquartile range)

고슴군 2019. 8. 21. 15:36

 

When the data are approximately normally distributed, outlier detection based on the standard deviation is appropriate.
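As a rough illustration of the standard-deviation approach, here is a minimal sketch using NumPy. The function name `std_outliers` and the cutoff of 3 standard deviations are assumptions made for this example (the note itself does not name a cutoff); adjust `k` to suit the data.

```python
import numpy as np


def std_outliers(values, k=3.0):
    """Return a boolean mask marking points more than k standard deviations from the mean.

    k=3.0 is an assumed, commonly used cutoff, not one taken from the original note.
    """
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    std = values.std()  # population standard deviation (ddof=0)
    lower = mean - k * std
    upper = mean + k * std
    return (values < lower) | (values > upper)


# Hypothetical sample: 50 values near 10 plus one extreme value.
sample = [9, 11] * 25 + [100]
mask = std_outliers(sample)
print(np.asarray(sample)[mask])
```

Because the mean and standard deviation are themselves pulled by extreme values, this rule works best when the bulk of the data really is close to normal.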

When the distribution has a long tail or is otherwise not normal, outlier detection based on the bounds Q1 - 1.5*IQR and Q3 + 1.5*IQR (where IQR = Q3 - Q1) is appropriate.
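The IQR rule can be sketched in a few lines of NumPy. The function name `iqr_outliers` and the sample data are hypothetical; the 1.5 multiplier is the one stated above, and the quartiles use `np.percentile` with its default linear interpolation, which is one of several possible quartile conventions.

```python
import numpy as np


def iqr_outliers(values, factor=1.5):
    """Return a boolean mask marking points outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])  # first and third quartiles
    iqr = q3 - q1
    lower = q1 - factor * iqr
    upper = q3 + factor * iqr
    return (values < lower) | (values > upper)


# Hypothetical sample with one value far out in the right tail.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
mask = iqr_outliers(sample)
print(np.asarray(sample)[mask])
```

Quartiles depend only on the ordering of the middle of the data, so the bounds are much less affected by the extreme values than a mean and standard deviation would be.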

 

 

https://towardsdatascience.com/5-ways-to-detect-outliers-that-every-data-scientist-should-know-python-code-70a54335a623

 

5 Ways to Detect Outliers/Anomalies That Every Data Scientist Should Know (Python Code)


 

https://machinelearningmastery.com/how-to-use-statistics-to-identify-outliers-in-data/

 

How to Use Statistics to Identify Outliers in Data


 

 
