Ministry of Education of the Russian Federation
MOSCOW STATE INSTITUTE OF ELECTRONICS AND MATHEMATICS (TECHNICAL UNIVERSITY)

LABORATORY WORK REPORT
Methods and Tools of Data Analysis
on the topic: "The WEKA Data Analysis System"
Variant datamining400-57

Supervisor ______________ I. Ignatyev (signature, date)
Author ______________ P. Stepuro (signature, date)
Group S-75

1. INTRODUCTION
2. MAIN PART
TASK 1. DATA PREPARATION
TASK 2. CLASSIFICATION
NaiveBayes
ID3
J4.8
SVM (SMO)
TASK 3.
3. CONCLUSION
4. DATA SET
INTRODUCTION

The Weka data analysis system is written in Java. It consists of a set of libraries of data-processing functions plus several graphical interfaces to those libraries. The main interface is the Explorer, which gives access to almost every operation the system provides. All the work in this report is done in it.

MAIN PART

TASK 1: PREPARE THE SOURCE FILE IN *.arff FORMAT.

First the table containing the data is converted to CSV and then modified. The modification consists of adding metadata at the top of the file, each item on its own line: the relation name (@relation name), the attribute descriptions (@attribute name type), and @data immediately before the data rows. The attribute types are: numeric (numeric, real, integer), nominal (given as an enumeration of the form {i1, ..., in}), string (string), and date (date [date format]). For example, the attribute capital-gain has type numeric, because it holds numeric data describing earnings. Attributes should be described as precisely as possible. With every attribute declared, the modified file can be saved in *.arff format:

@RELATION test
@ATTRIBUTE age numeric
@ATTRIBUTE workclass {Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked}
@ATTRIBUTE fnlwgt numeric
@ATTRIBUTE education {Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool}
@ATTRIBUTE education-num numeric
@ATTRIBUTE marital-status {Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse}
@ATTRIBUTE occupation {Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces}
@ATTRIBUTE relationship {Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried}
@ATTRIBUTE race {White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black}
@ATTRIBUTE sex {Female, Male}
@ATTRIBUTE capital-gain numeric
@ATTRIBUTE capital-loss numeric
@ATTRIBUTE hours-per-week numeric
@ATTRIBUTE native-country {United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands}
@ATTRIBUTE income {>50K,<=50K}

@DATA
34, Local-gov, 177675, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 55, United-States, >50K

TASK 2: CLASSIFY THE SOURCE DATA WITH THE BAYESIAN METHOD, J4.8, ID3, 1R, AND SVM. APPLY FILTERS WHERE NECESSARY.

The Visualize All button shows, in graphical form, how the dependent variable is distributed over the values of every attribute.

Filters are used for automatic preprocessing of the data. They come in two kinds: supervised filters, which take the class attribute into account (so applying them can bias a later evaluation), and unsupervised filters, which ignore the class and can be applied to data that has not been processed yet.
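The ARFF layout described in Task 1 (a @RELATION line, one @ATTRIBUTE line per field, then @DATA followed by the rows) can be sketched in a few lines of Python. This is an illustration only, not part of Weka: the helper names arff_attribute and to_arff are hypothetical, only three of the report's attributes are used, and the sketch assumes that no value contains spaces or commas, since such values would need quoting.

```python
def arff_attribute(name, spec):
    """Build one @ATTRIBUTE line.

    spec is either a type name such as "numeric" or a list of nominal values.
    """
    if isinstance(spec, (list, tuple)):
        return "@ATTRIBUTE %s {%s}" % (name, ", ".join(spec))
    return "@ATTRIBUTE %s %s" % (name, spec)


def to_arff(relation, attributes, rows):
    """Return ARFF text: relation name, attribute declarations, then data rows."""
    lines = ["@RELATION " + relation]
    lines += [arff_attribute(name, spec) for name, spec in attributes]
    lines.append("@DATA")
    lines += [", ".join(str(value) for value in row) for row in rows]
    return "\n".join(lines) + "\n"


# Three of the report's attributes, shortened for illustration.
attributes = [
    ("age", "numeric"),
    ("sex", ["Female", "Male"]),
    ("income", [">50K", "<=50K"]),
]
rows = [(34, "Male", ">50K")]
arff_text = to_arff("test", attributes, rows)
print(arff_text)
```

The order of the attribute declarations must match the order of the values in every data row, which is why both are passed as ordered sequences.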
The Choose button selects the classification method. Many methods are available; the most important are linear regression (in the functions section), naive Bayes classification (in the bayes section), decision tree construction (in the trees section), and rule construction (in the rules section). Once a method is chosen, its default parameter values can be changed. Next, the evaluation method and the dependent variable are selected. The main evaluation method is cross-validation. Results can also be evaluated on the training set (training set), on a separate test set (supplied test set), or on a held-out part of the training set (Percentage Split). Then the Start button is pressed. When the analysis finishes, the Output window is filled in and a new entry is added to the Result window.

In our case the evaluation method is cross-validation. The data set is divided into parts (10 folds in the runs reported here); each part in turn serves as the test set while the classifier is trained on the remaining parts. The errors made on the test parts are combined into the error estimate.
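The cross-validation procedure described above can be sketched as follows. This is a simplified illustration in plain Python, not Weka's implementation: the folds are not stratified, the function names are hypothetical, and the "classifier" is just a majority-class baseline standing in for a real learner. The class sizes 96 and 304 are the per-class instance counts that appear in the report's output.

```python
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_error(labels, k=10, seed=0):
    """Error rate of a majority-class baseline under k-fold cross-validation."""
    folds = k_fold_indices(len(labels), k, seed)
    errors = 0
    for test_fold in folds:
        held_out = set(test_fold)
        # "Train": find the most frequent class outside the held-out fold.
        train_labels = [labels[i] for i in range(len(labels)) if i not in held_out]
        majority = Counter(train_labels).most_common(1)[0][0]
        # "Test": count held-out instances that the baseline gets wrong.
        errors += sum(1 for i in test_fold if labels[i] != majority)
    return errors / len(labels)


labels = [">50K"] * 96 + ["<=50K"] * 304
error = cross_validated_error(labels, k=10)
print(error)
```

Every instance is used for testing exactly once, so the error is averaged over all 400 instances. The baseline always predicts <=50K here, which gives a reference point for judging the classifiers that follow.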
CLASSIFICATION OF THE DATA BY THE NAIVE BAYES METHOD.

One real advantage of this method is that missing values cause no problem. When the probabilities are computed, a missing value is simply skipped, so it does not affect the ratio of the class probabilities. Filters are therefore not needed.

Analysis (for people earning more than 50000 and at most 50000):

=== Run information ===

Scheme: weka.classifiers.bayes.NaiveBayes
Relation: test
Instances: 400 (total number of instances)
Attributes: 15 (attributes)
    age
    workclass
    fnlwgt
    education
    education-num
    marital-status
    occupation
    relationship
    race
    sex
    capital-gain
    capital-loss
    hours-per-week
    native-country
    income
Test mode: 10-fold cross-validation (cross-validation is used to estimate the error of the algorithm)

=== Classifier model (full training set) ===

Naive Bayes Classifier

Class >50K: Prior probability = 0.24 (24% of the people have an annual income above 50K)

age: Normal Distribution. Mean = 43.6962 StandardDev = 9.9212 WeightSum = 96 Precision = 1.0892857142857142 (their mean age is about 43.7 years, with a standard deviation of about 9.9)
workclass: Discrete Estimator. Counts = 64 15 7 4 7 4 1 1 (Total = 103) (counts per workclass value, in the order declared in the header: Private 64, Self-emp-not-inc 15, Self-emp-inc 7, Federal-gov 4, Local-gov 7, State-gov 4, Without-pay 1, Never-worked 1; every count starts from 1, so a count of 1 means that no such instance was observed)
fnlwgt: Normal Distribution. Mean = 191285.6891 StandardDev = 95275.8814 WeightSum = 96 Precision = 1835.791878172589 (mean value of the weight attribute fnlwgt is about 191285, with a standard deviation of about 95275)
education: Discrete Estimator. Counts = 23 19 1 24 5 5 6 1 1 1 13 1 2 8 1 1 (Total = 112) (counts per education level, in the order declared in the header: Bachelors 23, Some-college 19, 11th 1, HS-grad 24, Prof-school 5, Assoc-acdm 5, Assoc-voc 6, 9th 1, 7th-8th 1, 12th 1, Masters 13, 1st-4th 1, 10th 2, Doctorate 8, 5th-6th 1, Preschool 1)
education-num: Normal Distribution. Mean = 11.6875 StandardDev = 2.3466 WeightSum = 96 Precision = 1.0 (mean number of years of education is about 11.7, with a standard deviation of about 2.3)
marital-status: Discrete Estimator. Counts = 79 8 10 1 2 2 1 (Total = 103)
occupation: Discrete Estimator. Counts = 7 16 2 10 19 22 2 6 10 2 6 1 5 1 (Total = 109)
relationship: Discrete Estimator. Counts = 10 3 69 16 1 3 (Total = 102)
race: Discrete Estimator. Counts = 92 3 2 2 2 (Total = 101)
sex: Discrete Estimator. Counts = 16 82 (Total = 98)
capital-gain: Normal Distribution. Mean = 1846.0461 StandardDev = 4904.5334 WeightSum = 96 Precision = 1464.6315789473683
capital-loss: Normal Distribution. Mean = 158.85 StandardDev = 573.2855 WeightSum = 96 Precision = 282.4
hours-per-week: Normal Distribution. Mean = 45.0523 StandardDev = 10.7546 WeightSum = 96 Precision = 1.9534883720930232
native-country: Discrete Estimator. Counts = 88 1 1 1 3 2 1 1 2 2 1 1 1 1 1 2 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 (Total = 135)

(the same breakdown follows for the people earning at most 50K a year)

Class <=50K: Prior probability = 0.76

age: Normal Distribution. Mean = 36.5269 StandardDev = 13.8907 WeightSum = 304 Precision = 1.0892857142857142
workclass: Discrete Estimator. Counts = 236 22 8 4 13 10 1 1 (Total = 295)
fnlwgt: Normal Distribution. Mean = 197963.5834 StandardDev = 103982.0527 WeightSum = 304 Precision = 1835.791878172589
education: Discrete Estimator. Counts = 35 87 12 97 1 7 12 11 8 5 12 8 17 1 6 1 (Total = 320)
education-num: Normal Distribution. Mean = 9.2928 StandardDev = 2.5822 WeightSum = 304 Precision = 1.0
marital-status: Discrete Estimator. Counts = 110 50 122 15 7 6 1 (Total = 311)
occupation: Discrete Estimator. Counts = 11 48 44 30 28 22 11 30 35 10 22 4 5 1 (Total = 301)
relationship: Discrete Estimator. Counts = 15 72 93 80 16 34 (Total = 310)
race: Discrete Estimator. Counts = 260 9 4 3 33 (Total = 309)
sex: Discrete Estimator. Counts = 113 193 (Total = 306)
capital-gain: Normal Distribution. Mean = 120.4467 StandardDev = 633.8929 WeightSum = 304 Precision = 1464.6315789473683
capital-loss: Normal Distribution. Mean = 37.1579 StandardDev = 247.093 WeightSum = 304 Precision = 282.4
hours-per-week: Normal Distribution. Mean = 37.9067 StandardDev = 11.7867 WeightSum = 304 Precision = 1.9534883720930232
native-country: Discrete Estimator. Counts = 270 1 2 4 1 3 1 2 1 3 1 2 2 1 1 3 1 2 1 1 8 2 2 1 2 2 1 1 2 1 1 3 1 1 1 1 2 1 1 1 1 (Total = 339)

Time taken to build model: 0.03 seconds (time spent building the model)

=== Stratified cross-validation ===
=== Summary ===

Correctly Classified Instances         329      82.25 %   (instances classified correctly)
Incorrectly Classified Instances        71      17.75 %   (instances classified incorrectly)
(the percentage of correctly classified instances is the accuracy of the algorithm)
Kappa statistic                          0.4616
Mean absolute error                      0.1923           (error measures)
Root mean squared error                  0.3929
Relative absolute error                 52.5958 %         (relative)
Root relative squared error             91.9981 %         (squared)
Total Number of Instances              400                (total)

=== Detailed Accuracy By Class ===

TP Rate   FP Rate   Precision   Recall   F-Measure   Class
0.49      0.072     0.681       0.49     0.57        >50K
0.928     0.51      0.852       0.928    0.888       <=50K

=== Confusion Matrix ===

   a    b   <-- classified as
  47   49 | a = >50K
  22  282 | b = <=50K

CLASSIFICATION OF THE DATA BY THE ID3 METHOD.

This method builds a decision tree. To classify the data, the training set is split recursively into subsets until each subset contains objects of a single class. Building the tree requires choosing, at every internal node, the independent variable on which to split. ID3 chooses the variable whose split gives the largest information gain, that is, the split after which one class dominates each subset as strongly as possible. The algorithm requires that the input contain only nominal attributes and no missing values.
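The attribute choice that ID3 makes at each node can be sketched with entropy and information gain. The code below is a minimal plain-Python illustration, not Weka's Id3 class: the function names are hypothetical, and the four rows are made-up toy data that only borrow attribute names from the report.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * log2(count / total)
                for count in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus its weighted entropy after splitting on attribute."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(group) / len(rows) * entropy(group)
                    for group in groups.values())
    return base - remainder


def best_attribute(rows, attributes, target):
    """Attribute with the largest information gain, as chosen at an ID3 node."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Toy rows (hypothetical, not taken from the report's data set).
rows = [
    {"sex": "Male", "education": "Masters", "income": ">50K"},
    {"sex": "Female", "education": "Masters", "income": ">50K"},
    {"sex": "Male", "education": "11th", "income": "<=50K"},
    {"sex": "Female", "education": "11th", "income": "<=50K"},
]
print(best_attribute(rows, ["sex", "education"], "income"))
```

In the toy rows education separates the two classes completely while sex does not separate them at all, so education is selected. A full ID3 implementation would repeat this choice recursively on every resulting subset.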
Because ID3 accepts only nominal attributes without missing values, we apply two filters: RemoveType (to remove the attributes of type numeric) and ReplaceMissingValues (to replace missing values with the attribute mean for numeric attributes or the most frequent value for nominal ones). The relation name in the log also records a Normalize filter. Results can be checked on the training set (training set) to obtain a higher reported accuracy, but such an estimate is optimistic; the run log below records 10-fold cross-validation as its test mode.

Analysis:

=== Run information ===

Scheme: weka.classifiers.trees.Id3
Relation: test-weka.filters.unsupervised.attribute.Normalize-weka.filters.unsupervised.attribute.RemoveType-Tnumeric-weka.filters.unsupervised.attribute.ReplaceMissingValues
Instances: 400
Attributes: 9 workclass education marital-status occupation relationship race
sex native-country income Test mode: 10-fold cross-validation === Classifier model (full training set) === Id3 education = Bachelors | relationship = Wife | | occupation = Tech-support: null | | occupation = Craft-repair: null | | occupation = Other-service: <=50K | | occupation = Sales: null | | occupation = Exec-managerial: null | | occupation = Prof-specialty: >50K | | occupation = Handlers-cleaners: null | | occupation = Machine-op-inspct: null | | occupation = Adm-clerical: null | | occupation = Farming-fishing: null | | occupation = Transport-moving: null | | occupation = Priv-house-serv: null | | occupation = Protective-serv: null | | occupation = Armed-Forces: null | relationship = Own-child: <=50K | relationship = Husband | | occupation = Tech-support: null | | occupation = Craft-repair: >50K | | occupation = Other-service: null | | occupation = Sales: >50K | | occupation = Exec-managerial | | | native-country = United-States | | | | workclass = Private: >50K | | | | workclass = Self-emp-not-inc: >50K | | | | workclass = Self-emp-inc: null | | | | workclass = Federal-gov: null | | | | workclass = Local-gov: >50K | | | | workclass = State-gov: null | | | | workclass = Without-pay: null | | | | workclass = Never-worked: null | | | native-country = Cambodia: null | | | native-country = England: null | | | native-country = Puerto-Rico: null | | | native-country = Canada: null | | | native-country = Germany: <=50K | | | native-country = Outlying-US(Guam-USVI-etc): null | | | native-country = India: null | | | native-country = Japan: >50K | | | native-country = Greece: >50K | | | native-country = South: null | | | native-country = China: null | | | native-country = Cuba: null | | | native-country = Iran: null | | | native-country = Honduras: null | | | native-country = Philippines: null | | | native-country = Italy: null | | | native-country = Poland: null | | | native-country = Jamaica: null | | | native-country = Vietnam: null | | | native-country = Mexico: 
null | | | native-country = Portugal: null | | | native-country = Ireland: null | | | native-country = France: null | | | native-country = Dominican-Republic: null | | | native-country = Laos: null | | | native-country = Ecuador: null | | | native-country = Taiwan: null | | | native-country = Haiti: null | | | native-country = Columbia: null | | | native-country = Hungary: null | | | native-country = Guatemala: null | | | native-country = Nicaragua: null | | | native-country = Scotland: null | | | native-country = Thailand: null | | | native-country = Yugoslavia: null | | | native-country = El-Salvador: null | | | native-country = Trinadad&Tobago: null | | | native-country = Peru: null | | | native-country = Hong: null | | | native-country = Holand-Netherlands: null | | occupation = Prof-specialty | | | workclass = Private | | | | native-country = United-States: >50K | | | | native-country = Cambodia: null | | | | native-country = England: null | | | | native-country = Puerto-Rico: null | | | | native-country = Canada: >50K | | | | native-country = Germany: null | | | | native-country = Outlying-US(Guam-USVI-etc): null | | | | native-country = India: null | | | | native-country = Japan: null | | | | native-country = Greece: null | | | | native-country = South: null | | | | native-country = China: null | | | | native-country = Cuba: null | | | | native-country = Iran: null | | | | native-country = Honduras: null | | | | native-country = Philippines: null | | | | native-country = Italy: null | | | | native-country = Poland: null | | | | native-country = Jamaica: null | | | | native-country = Vietnam: null | | | | native-country = Mexico: null | | | | native-country = Portugal: null | | | | native-country = Ireland: null | | | | native-country = France: null | | | | native-country = Dominican-Republic: null | | | | native-country = Laos: null | | | | native-country = Ecuador: null | | | | native-country = Taiwan: null | | | | native-country = Haiti: null | | | | 
native-country = Columbia: null | | | | native-country = Hungary: null | | | | native-country = Guatemala: null | | | | native-country = Nicaragua: null | | | | native-country = Scotland: null | | | | native-country = Thailand: null | | | | native-country = Yugoslavia: null | | | | native-country = El-Salvador: null | | | | native-country = Trinadad&Tobago: null | | | | native-country = Peru: null | | | | native-country = Hong: null | | | | native-country = Holand-Netherlands: null | | | workclass = Self-emp-not-inc: >50K | | | workclass = Self-emp-inc: null | | | workclass = Federal-gov: null | | | workclass = Local-gov: null | | | workclass = State-gov: >50K | | | workclass = Without-pay: null | | | workclass = Never-worked: null | | occupation = Handlers-cleaners: null | | occupation = Machine-op-inspct: null | | occupation = Adm-clerical: >50K | | occupation = Farming-fishing: null | | occupation = Transport-moving: <=50K | | occupation = Priv-house-serv: null | | occupation = Protective-serv: null | | occupation = Armed-Forces: null | relationship = Not-in-family | | occupation = Tech-support: <=50K | | occupation = Craft-repair: null | | occupation = Other-service: <=50K | | occupation = Sales: null | | occupation = Exec-managerial: <=50K | | occupation = Prof-specialty | | | sex = Female: <=50K | | | sex = Male | | | | workclass = Private: <=50K | | | | workclass = Self-emp-not-inc: null | | | | workclass = Self-emp-inc: null | | | | workclass = Federal-gov: null | | | | workclass = Local-gov: null | | | | workclass = State-gov: <=50K | | | | workclass = Without-pay: null | | | | workclass = Never-worked: null | | occupation = Handlers-cleaners: null | | occupation = Machine-op-inspct: <=50K | | occupation = Adm-clerical: <=50K | | occupation = Farming-fishing: null | | occupation = Transport-moving: null | | occupation = Priv-house-serv: null | | occupation = Protective-serv: null | | occupation = Armed-Forces: null | relationship = Other-relative: <=50K | 
relationship = Unmarried: <=50K education = Some-college | relationship = Wife | | occupation = Tech-support: null | | occupation = Craft-repair: <=50K | | occupation = Other-service: null | | occupation = Sales: null | | occupation = Exec-managerial: null | | occupation = Prof-specialty: >50K | | occupation = Handlers-cleaners: null | | occupation = Machine-op-inspct: >50K | | occupation = Adm-clerical: <=50K | | occupation = Farming-fishing: null | | occupation = Transport-moving: null | | occupation = Priv-house-serv: null | | occupation = Protective-serv: null | | occupation = Armed-Forces: null | relationship = Own-child | | marital-status = Married-civ-spouse: null | | marital-status = Divorced | | | occupation = Tech-support: null | | | occupation = Craft-repair: <=50K | | | occupation = Other-service: null | | | occupation = Sales: null | | | occupation = Exec-managerial: null | | | occupation = Prof-specialty: null | | | occupation = Handlers-cleaners: null | | | occupation = Machine-op-inspct: null | | | occupation = Adm-clerical: >50K | | | occupation = Farming-fishing: null | | | occupation = Transport-moving: null | | | occupation = Priv-house-serv: null | | | occupation = Protective-serv: null | | | occupation = Armed-Forces: null | | marital-status = Never-married: <=50K | | marital-status = Separated: <=50K | | marital-status = Widowed: null | | marital-status = Married-spouse-absent: null | | marital-status = Married-AF-spouse: null | relationship = Husband | | occupation = Tech-support | | | workclass = Private: <=50K | | | workclass = Self-emp-not-inc: >50K | | | workclass = Self-emp-inc: null | | | workclass = Federal-gov: null | | | workclass = Local-gov: null | | | workclass = State-gov: null | | | workclass = Without-pay: null | | | workclass = Never-worked: null | | occupation = Craft-repair | | | workclass = Private: <=50K | | | workclass = Self-emp-not-inc: <=50K | | | workclass = Self-emp-inc: <=50K | | | workclass = Federal-gov: null | | 
| workclass = Local-gov: <=50K | | | workclass = State-gov: >50K | | | workclass = Without-pay: null | | | workclass = Never-worked: null | | occupation = Other-service: <=50K | | occupation = Sales: <=50K | | occupation = Exec-managerial | | | workclass = Private: >50K | | | workclass = Self-emp-not-inc: <=50K | | | workclass = Self-emp-inc: null | | | workclass = Federal-gov: null | | | workclass = Local-gov: null | | | workclass = State-gov: null | | | workclass = Without-pay: null | | | workclass = Never-worked: null | | occupation = Prof-specialty: null | | occupation = Handlers-cleaners: null | | occupation = Machine-op-inspct: >50K | | occupation = Adm-clerical: >50K | | occupation = Farming-fishing | | | workclass = Private: >50K | | | workclass = Self-emp-not-inc: null | | | workclass = Self-emp-inc: <=50K | | | workclass = Federal-gov: null | | | workclass = Local-gov: null | | | workclass = State-gov: null | | | workclass = Without-pay: null | | | workclass = Never-worked: null | | occupation = Transport-moving | | | workclass = Private: <=50K | | | workclass = Self-emp-not-inc: null | | | workclass = Self-emp-inc: >50K | | | workclass = Federal-gov: null | | | workclass = Local-gov: null | | | workclass = State-gov: null | | | workclass = Without-pay: null | | | workclass = Never-worked: null | | occupation = Priv-house-serv: null | | occupation = Protective-serv | | | workclass = Private: <=50K | | | workclass = Self-emp-not-inc: null | | | workclass = Self-emp-inc: null | | | workclass = Federal-gov: null | | | workclass = Local-gov: >50K | | | workclass = State-gov: null | | | workclass = Without-pay: null | | | workclass = Never-worked: null | | occupation = Armed-Forces: null | relationship = Not-in-family | | occupation = Tech-support: <=50K | | occupation = Craft-repair | | | sex = Female: >50K | | | sex = Male | | | | marital-status = Married-civ-spouse: null | | | | marital-status = Divorced: >50K | | | | marital-status = Never-married: <=50K | 
| | | marital-status = Separated: <=50K | | | | marital-status = Widowed: null | | | | marital-status = Married-spouse-absent: null | | | | marital-status = Married-AF-spouse: null | | occupation = Other-service: <=50K | | occupation = Sales | | | marital-status = Married-civ-spouse: null | | | marital-status = Divorced: <=50K | | | marital-status = Never-married | | | | race = White: >50K | | | | race = Asian-Pac-Islander: null | | | | race = Amer-Indian-Eskimo: null | | | | race = Other: null | | | | race = Black: <=50K | | | marital-status = Separated: null | | | marital-status = Widowed: >50K | | | marital-status = Married-spouse-absent: null | | | marital-status = Married-AF-spouse: null | | occupation = Exec-managerial: null | | occupation = Prof-specialty: <=50K | | occupation = Handlers-cleaners: null | | occupation = Machine-op-inspct: <=50K | | occupation = Adm-clerical | | | marital-status = Married-civ-spouse: null | | | marital-status = Divorced: null | | | marital-status = Never-married: <=50K | | | marital-status = Separated: null | | | marital-status = Widowed: <=50K | | | marital-status = Married-spouse-absent: >50K | | | marital-status = Married-AF-spouse: null | | occupation = Farming-fishing: <=50K | | occupation = Transport-moving: <=50K | | occupation = Priv-house-serv: null | | occupation = Protective-serv: <=50K | | occupation = Armed-Forces: null | relationship = Other-relative: <=50K | relationship = Unmarried: <=50K education = 11th: <=50K education = HS-grad | marital-status = Married-civ-spouse | | occupation = Tech-support: >50K | | occupation = Craft-repair | | | native-country = United-States | | | | workclass = Private | | | | | relationship = Wife: <=50K | | | | | relationship = Own-child: null | | | | | relationship = Husband: <=50K | | | | | relationship = Not-in-family: null | | | | | relationship = Other-relative: null | | | | | relationship = Unmarried: null | | | | workclass = Self-emp-not-inc: null | | | | workclass = 
Self-emp-inc: <=50K | | | | workclass = Federal-gov: null | | | | workclass = Local-gov: null | | | | workclass = State-gov: null | | | | workclass = Without-pay: null | | | | workclass = Never-worked: null | | | native-country = Cambodia: null | | | native-country = England: null | | | native-country = Puerto-Rico: null | | | native-country = Canada: null | | | native-country = Germany: null | | | native-country = Outlying-US(Guam-USVI-etc): null | | | native-country = India: null | | | native-country = Japan: null | | | native-country = Greece: null | | | native-country = South: null | | | native-country = China: null | | | native-country = Cuba: null | | | native-country = Iran: null | | | native-country = Honduras: null | | | native-country = Philippines: null | | | native-country = Italy: null | | | native-country = Poland: null | | | native-country = Jamaica: null | | | native-country = Vietnam: null | | | native-country = Mexico: >50K | | | native-country = Portugal: null | | | native-country = Ireland: null | | | native-country = France: null | | | native-country = Dominican-Republic: null | | | native-country = Laos: null | | | native-country = Ecuador: null | | | native-country = Taiwan: null | | | native-country = Haiti: null | | | native-country = Columbia: null | | | native-country = Hungary: null | | | native-country = Guatemala: null | | | native-country = Nicaragua: null | | | native-country = Scotland: null | | | native-country = Thailand: null | | | native-country = Yugoslavia: null | | | native-country = El-Salvador: null | | | native-country = Trinadad&Tobago: null | | | native-country = Peru: null | | | native-country = Hong: null | | | native-country = Holand-Netherlands: null | | occupation = Other-service | | | relationship = Wife: >50K | | | relationship = Own-child: null | | | relationship = Husband: <=50K | | | relationship = Not-in-family: null | | | relationship = Other-relative: null | | | relationship = Unmarried: null | | occupation 
= Sales | | | relationship = Wife: null | | | relationship = Own-child: >50K | | | relationship = Husband: <=50K | | | relationship = Not-in-family: null | | | relationship = Other-relative: null | | | relationship = Unmarried: null | | occupation = Exec-managerial | | | workclass = Private | | | | relationship = Wife: <=50K | | | | relationship = Own-child: null | | | | relationship = Husband: >50K | | | | relationship = Not-in-family: null | | | | relationship = Other-relative: null | | | | relationship = Unmarried: null | | | workclass = Self-emp-not-inc: <=50K | | | workclass = Self-emp-inc: >50K | | | workclass = Federal-gov: null | | | workclass = Local-gov: >50K | | | workclass = State-gov: null | | | workclass = Without-pay: null | | | workclass = Never-worked: null | | occupation = Prof-specialty: null | | occupation = Handlers-cleaners | | | native-country = United-States: >50K | | | native-country = Cambodia: null | | | native-country = England: null | | | native-country = Puerto-Rico: null | | | native-country = Canada: null | | | native-country = Germany: null | | | native-country = Outlying-US(Guam-USVI-etc): null | | | native-country = India: null | | | native-country = Japan: null | | | native-country = Greece: null | | | native-country = South: null | | | native-country = China: null | | | native-country = Cuba: null | | | native-country = Iran: null | | | native-country = Honduras: null | | | native-country = Philippines: null | | | native-country = Italy: null | | | native-country = Poland: null | | | native-country = Jamaica: null | | | native-country = Vietnam: null | | | native-country = Mexico: <=50K | | | native-country = Portugal: null | | | native-country = Ireland: null | | | native-country = France: null | | | native-country = Dominican-Republic: null | | | native-country = Laos: null | | | native-country = Ecuador: null | | | native-country = Taiwan: null | | | native-country = Haiti: null | | | native-country = Columbia: null | | | 
native-country = Hungary: null | | | native-country = Guatemala: null | | | native-country = Nicaragua: null | | | native-country = Scotland: null | | | native-country = Thailand: null | | | native-country = Yugoslavia: null | | | native-country = El-Salvador: null | | | native-country = Trinadad&Tobago: null | | | native-country = Peru: null | | | native-country = Hong: null | | | native-country = Holand-Netherlands: null | | occupation = Machine-op-inspct | | | workclass = Private: <=50K | | | workclass = Self-emp-not-inc: <=50K | | | workclass = Self-emp-inc: null | | | workclass = Federal-gov: null | | | workclass = Local-gov: null | | | workclass = State-gov: null | | | workclass = Without-pay: null | | | workclass = Never-worked: null | | occupation = Adm-clerical: >50K | | occupation = Farming-fishing: <=50K | | occupation = Transport-moving | | | relationship = Wife: <=50K | | | relationship = Own-child: null | | | relationship = Husband: <=50K | | | relationship = Not-in-family: null | | | relationship = Other-relative: null | | | relationship = Unmarried: null | | occupation = Priv-house-serv: null | | occupation = Protective-serv: >50K | | occupation = Armed-Forces: null | marital-status = Divorced | | occupation = Tech-support: null | | occupation = Craft-repair | | | workclass = Private: <=50K | | | workclass = Self-emp-not-inc: >50K | | | workclass = Self-emp-inc: null | | | workclass = Federal-gov: null | | | workclass = Local-gov: >50K | | | workclass = State-gov: null | | | workclass = Without-pay: null | | | workclass = Never-worked: null | | occupation = Other-service: <=50K | | occupation = Sales: <=50K | | occupation = Exec-managerial | | | workclass = Private: >50K | | | workclass = Self-emp-not-inc: <=50K | | | workclass = Self-emp-inc: null | | | workclass = Federal-gov: null | | | workclass = Local-gov: null | | | workclass = State-gov: null | | | workclass = Without-pay: null | | | workclass = Never-worked: null | | occupation = 
Prof-specialty: null
| | occupation = Handlers-cleaners: null
| | occupation = Machine-op-inspct: >50K
| | occupation = Adm-clerical: <=50K
| | occupation = Farming-fishing: null
| | occupation = Transport-moving: <=50K
| | occupation = Priv-house-serv: null
| | occupation = Protective-serv: null
| | occupation = Armed-Forces: null
| marital-status = Never-married: <=50K
| marital-status = Separated: <=50K
| marital-status = Widowed: <=50K
| marital-status = Married-spouse-absent: null
| marital-status = Married-AF-spouse: null
education = Prof-school: >50K
education = Assoc-acdm
| occupation = Tech-support: null
| occupation = Craft-repair: >50K
| occupation = Other-service: null
| occupation = Sales: <=50K
| occupation = Exec-managerial: >50K
| occupation = Prof-specialty: <=50K
| occupation = Handlers-cleaners: null
| occupation = Machine-op-inspct: >50K
| occupation = Adm-clerical
| | marital-status = Married-civ-spouse: >50K
| | marital-status = Divorced: null
| | marital-status = Never-married: <=50K
| | marital-status = Separated: null
| | marital-status = Widowed: <=50K
| | marital-status = Married-spouse-absent: null
| | marital-status = Married-AF-spouse: null
| occupation = Farming-fishing: null
| occupation = Transport-moving: <=50K
| occupation = Priv-house-serv: null
| occupation = Protective-serv: <=50K
| occupation = Armed-Forces: null
education = Assoc-voc
| relationship = Wife: <=50K
| relationship = Own-child: <=50K
| relationship = Husband
| | occupation = Tech-support: >50K
| | occupation = Craft-repair: <=50K
| | occupation = Other-service: <=50K
| | occupation = Sales: null
| | occupation = Exec-managerial: >50K
| | occupation = Prof-specialty: null
| | occupation = Handlers-cleaners: null
| | occupation = Machine-op-inspct: <=50K
| | occupation = Adm-clerical: null
| | occupation = Farming-fishing: null
| | occupation = Transport-moving: >50K
| | occupation = Priv-house-serv: null
| | occupation = Protective-serv: >50K
| | occupation = Armed-Forces: null
| relationship = Not-in-family: <=50K
| relationship = Other-relative: <=50K
| relationship = Unmarried: <=50K
education = 9th: <=50K
education = 7th-8th: <=50K
education = 12th: <=50K
education = Masters
| occupation = Tech-support: >50K
| occupation = Craft-repair
| | marital-status = Married-civ-spouse: <=50K
| | marital-status = Divorced: null
| | marital-status = Never-married: >50K
| | marital-status = Separated: null
| | marital-status = Widowed: <=50K
| | marital-status = Married-spouse-absent: null
| | marital-status = Married-AF-spouse: null
| occupation = Other-service: null
| occupation = Sales
| | workclass = Private: >50K
| | workclass = Self-emp-not-inc: >50K
| | workclass = Self-emp-inc: null
| | workclass = Federal-gov: null
| | workclass = Local-gov: null
| | workclass = State-gov: null
| | workclass = Without-pay: null
| | workclass = Never-worked: null
| occupation = Exec-managerial: >50K
| occupation = Prof-specialty
| | relationship = Wife: >50K
| | relationship = Own-child: null
| | relationship = Husband
| | | workclass = Private: >50K
| | | workclass = Self-emp-not-inc: >50K
| | | workclass = Self-emp-inc: null
| | | workclass = Federal-gov: null
| | | workclass = Local-gov: <=50K
| | | workclass = State-gov: null
| | | workclass = Without-pay: null
| | | workclass = Never-worked: null
| | relationship = Not-in-family: <=50K
| | relationship = Other-relative: <=50K
| | relationship = Unmarried: null
| occupation = Handlers-cleaners: null
| occupation = Machine-op-inspct: null
| occupation = Adm-clerical: <=50K
| occupation = Farming-fishing: null
| occupation = Transport-moving: >50K
| occupation = Priv-house-serv: null
| occupation = Protective-serv: <=50K
| occupation = Armed-Forces: null
education = 1st-4th: <=50K
education = 10th
| occupation = Tech-support: null
| occupation = Craft-repair: <=50K
| occupation = Other-service: <=50K
| occupation = Sales: <=50K
| occupation = Exec-managerial: null
| occupation = Prof-specialty: null
| occupation = Handlers-cleaners: <=50K
| occupation = Machine-op-inspct: <=50K
| occupation = Adm-clerical: null
| occupation = Farming-fishing: <=50K
| occupation = Transport-moving: >50K
| occupation = Priv-house-serv: null
| occupation = Protective-serv: null
| occupation = Armed-Forces: null
education = Doctorate: >50K
education = 5th-6th: <=50K
education = Preschool: null
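To make the reading of the tree concrete, here is a minimal Python sketch that walks a hand-copied fragment of the "education = Masters" branch. The nested-tuple representation, the name MASTERS_SUBTREE and the function classify are illustrative assumptions, not part of WEKA; None stands for a "null" leaf, and treating an attribute value missing from the fragment as None is also an assumption of this sketch.

```python
# Fragment of the "education = Masters" subtree from the ID3 output.
# A node is (attribute_name, {value: child}); a leaf is a class label.
# None stands for a "null" leaf (assumed: no training instance reached it).
MASTERS_SUBTREE = ("occupation", {
    "Tech-support": ">50K",
    "Other-service": None,
    "Sales": ("workclass", {
        "Private": ">50K",
        "Self-emp-not-inc": ">50K",
        "Self-emp-inc": None,
    }),
    "Exec-managerial": ">50K",
})


def classify(node, instance):
    """Follow the branches matching the instance until a leaf is reached.

    Returns the class label, or None when the path ends in a "null" leaf
    (or in a value that this fragment does not list).
    """
    while isinstance(node, tuple):
        attribute, branches = node
        node = branches.get(instance[attribute])
    return node


print(classify(MASTERS_SUBTREE, {"occupation": "Sales", "workclass": "Private"}))
print(classify(MASTERS_SUBTREE, {"occupation": "Sales", "workclass": "Self-emp-inc"}))
```

The second call ends in a "null" leaf and returns None, which is the kind of path on which the tree gives no prediction.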
=== Summary ===
Correctly Classified Instances      274      68.5    %
Incorrectly Classified Instances     70      17.5    %
Kappa statistic                       0.3739
Mean absolute error                   0.2285
Root mean squared error               0.4468
Relative absolute error              74.3816 %
Root relative squared error         115.472  %
UnClassified Instances               56      14      %
Total Number of Instances           400
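The percentages in the summary are taken over all 400 instances, including the 56 that were left unclassified (presumably those that ended in a "null" leaf). A small Python sketch of the arithmetic, with variable names chosen here only for illustration:

```python
# Counts copied from the Summary block.
correct, incorrect, unclassified = 274, 70, 56

total = correct + incorrect + unclassified   # all instances
classified = correct + incorrect             # instances that got a prediction


def percent(part, whole):
    """Share of `part` in `whole`, in percent, rounded to 2 decimals."""
    return round(100 * part / whole, 2)


share_correct = percent(correct, total)            # reported as 68.5 %
share_incorrect = percent(incorrect, total)        # reported as 17.5 %
share_unclassified = percent(unclassified, total)  # reported as 14 %

# Accuracy restricted to the instances that actually received a label.
accuracy_on_classified = percent(correct, classified)

print(total, share_correct, share_incorrect, share_unclassified)
print(accuracy_on_classified)
```

Restricted to the 344 classified instances, the share of correct answers is about 79.65 %, noticeably higher than the 68.5 % reported over the whole set.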
=== Detailed Accuracy By Class ===
TP Rate   FP Rate   Precision   Recall   F-Measure   Class
0.455     0.105     0.556       0.455    0.5         >50K
0.895     0.545     0.851       0.895    0.872       <=50K
=== Confusion Matrix ===
  a   b   <-- classified as
 35  42 | a = >50K
 28 239 | b = <=50K

Result:
Wife
- with occupation Other-service, a person earns less than 50000
- with occupation Prof-specialty, a person earns more than 50000
Own-child earns less than 50000
Husband
- Craft-repair and Sales workers earn more than 50000
-- born in the United States
--- self-employed, private-sector and government workers earn more than 50000
-- born in Germany: earn less than 50000
-- born in Japan and Greece: earn more than 50000
- with occupation Prof-specialty
-- Private
--- born in the United States or Canada: earn more than 50000
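The per-class figures and the Kappa statistic reported for this run can be cross-checked directly from the confusion matrix (rows are actual classes, columns are predicted classes; unclassified instances are not in the matrix). The following Python sketch uses the standard textbook definitions; the function names per_class_metrics and cohen_kappa are chosen here for illustration and are not WEKA API.

```python
# Confusion matrix from the ID3 run: rows = actual, columns = predicted.
# Index 0 is ">50K", index 1 is "<=50K".
MATRIX = [[35, 42],
          [28, 239]]


def per_class_metrics(matrix, k):
    """TP rate (recall), FP rate, precision and F-measure for class k."""
    total = sum(map(sum, matrix))
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                   # actual k, predicted other
    fp = sum(row[k] for row in matrix) - tp    # actual other, predicted k
    tn = total - tp - fn - fp
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return {
        "tp_rate": recall,
        "fp_rate": fp / (fp + tn),
        "precision": precision,
        "f_measure": 2 * precision * recall / (precision + recall),
    }


def cohen_kappa(matrix):
    """Agreement between actual and predicted labels, corrected for chance."""
    total = sum(map(sum, matrix))
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix)
        for i in range(len(matrix))
    ) / total ** 2
    return (observed - expected) / (1 - expected)


for k, label in enumerate([">50K", "<=50K"]):
    rounded = {name: round(v, 3) for name, v in per_class_metrics(MATRIX, k).items()}
    print(label, rounded)
print("kappa", round(cohen_kappa(MATRIX), 4))
```

Rounded to the same precision, these values agree with the reported rows (0.455, 0.105, 0.556, 0.5 for >50K; 0.895, 0.545, 0.851, 0.872 for <=50K) and with the Kappa statistic of 0.3739.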