데이터의 분포가 정규분포를 띌 때는, 표준편차를 이용한 outlier detection가 적합
데이터의 분포가 꼬리를 가지고 있거나 정규분포가 아닐 때는, Q1-1.5*IQR와 Q3+1.5*IQR 를 이용한 outlier detection이 적합
5 Ways to Detect Outliers/Anomalies That Every Data Scientist Should Know (Python Code)
Detecting Anomalies is critical to any business either by identifying faults or being proactive. This article discusses 5 different ways…
towardsdatascience.com
https://machinelearningmastery.com/how-to-use-statistics-to-identify-outliers-in-data/
How to Use Statistics to Identify Outliers in Data
When modeling, it is important to clean the data sample to ensure that the observations best represent the problem. Sometimes a dataset can contain extreme values that are outside the range of what is expected and unlike the other data. These are called ou
machinelearningmastery.com
'Machine Learning > Algorithm' 카테고리의 다른 글
Local Outlier Factors (LOF) (0) | 2019.10.07 |
---|---|
Random Forest (랜덤 포레스트) (0) | 2019.09.23 |
DBSCAN (0) | 2019.08.18 |
Singular Value Decomposition (SVD) (0) | 2019.08.18 |
PCA (Principal Component Analysis) (0) | 2019.08.16 |