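Pros: high accuracy, insensitive to outliers, no assumptions about the input data
Cons: computationally expensive, high memory usage
Applicable data types: numeric and nominal values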
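General workflow:
(1) Collect data (for example, by web scraping).
(2) Prepare the data: convert it into a structured format.
(3) Analyze the data.
(4) Test the algorithm (mainly by computing the model's error rate).
(5) Use the algorithm.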
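Step (4) usually comes down to comparing predicted labels with known labels on held-out data. Here is a minimal sketch of that calculation; the helper name `error_rate` and the sample labels are made up for illustration.

```python
def error_rate(predicted, actual):
    """Fraction of test samples whose predicted label differs from the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute an error rate on an empty test set")
    # Count the positions where the prediction does not match the true label.
    errors = sum(1 for p, a in zip(predicted, actual) if p != a)
    return errors / len(actual)


# Hypothetical labels: one of the five predictions is wrong.
predicted = [0, 1, 2, 3, 4]
actual = [0, 1, 2, 3, 9]
print(error_rate(predicted, actual))
```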
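The k-nearest neighbors (kNN) algorithm classifies data by measuring the distance between feature values.
How it works: there is a training sample set in which every sample carries a label, so the class of each sample is known.
When new, unlabeled data comes in, each of its features is compared with the corresponding features of the samples in the training set,
and the algorithm extracts the class labels of the k most similar (nearest) samples, where k is usually no larger than 20.
The class that occurs most often among those k samples becomes the class of the new data.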
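The steps above can be sketched in a few lines of plain Python. This is only a minimal illustration: the function name `knn_classify` is hypothetical, Euclidean distance is assumed as the similarity measure, and the four toy points are there just to make the vote visible.

```python
import math
from collections import Counter


def knn_classify(query, samples, labels, k):
    """Return the majority label among the k training samples nearest to query."""
    # Euclidean distance from the query point to every training sample.
    distances = [math.dist(query, sample) for sample in samples]
    # Indices of the k training samples with the smallest distances.
    nearest = sorted(range(len(samples)), key=lambda i: distances[i])[:k]
    # Count the labels of those k samples and take the most frequent one.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data: two samples near (1, 1) labeled 'A', two near (0, 0) labeled 'B'.
samples = [(1.0, 1.1), (1.0, 1.0), (0.0, 0.0), (0.0, 0.1)]
labels = ["A", "A", "B", "B"]

print(knn_classify((1.0, 1.2), samples, labels, k=3))
print(knn_classify((0.0, 0.4), samples, labels, k=3))
```

With k=3, the first query has two 'A' neighbors and one 'B' neighbor, so it is assigned to 'A'; the second query has two 'B' neighbors and is assigned to 'B'.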
Normalization is only needed when the features differ greatly in numeric range yet are considered equally important.
[Image: distance calculation equation; the original picture could not be recovered]
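In the equation above, the attribute with the largest numeric differences has the greatest influence on the result, simply because the frequent flyer miles are far larger than the other feature values. However, we regard the three features as equally important, so they should be treated as three equally weighted features, and frequent flyer miles should not dominate the distance.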
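One common way to give features equal weight is min-max scaling, which maps each feature into the range [0, 1] using newValue = (oldValue - min) / (max - min). The sketch below assumes that approach. The function name `min_max_normalize` and the sample rows are hypothetical, chosen only to mimic one feature with a much larger numeric range than the others.

```python
def min_max_normalize(rows):
    """Scale each column of rows into [0, 1] using (value - min) / (max - min)."""
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    ranges = [max(col) - min(col) for col in columns]
    normalized = []
    for row in rows:
        normalized.append([
            # A constant column has range 0; map it to 0.0 to avoid dividing by zero.
            (value - lo) / span if span else 0.0
            for value, lo, span in zip(row, mins, ranges)
        ])
    return normalized, mins, ranges


# Hypothetical rows: [large-range feature, small-range feature, small-range feature].
rows = [
    [40000.0, 8.0, 0.9],
    [14000.0, 7.0, 1.7],
    [26000.0, 1.0, 0.1],
]
normalized, mins, ranges = min_max_normalize(rows)
for row in normalized:
    print(row)
```

The returned `mins` and `ranges` would also be needed to scale a new, unlabeled sample in the same way before classifying it.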
Here is the code:
from numpy import array, tile, zeros
import matplotlib.pyplot as plot
import operator
from os import listdir

def classify0(inX, dataSet, labels, k):
    dataSetSize = dataSet.shape[0]
    # Euclidean distance between inX and every row of dataSet
    diffMat = tile(inX, (dataSetSize, 1)) - dataSet
    sqDiffMat = diffMat ** 2
    sqDistances = sqDiffMat.sum(axis=1)
    distances = sqDistances ** 0.5
    # Sort the distances in ascending order and return the indices of the sorted samples
    sortedDistIndicies = distances.argsort()
    # Dictionary that counts the votes for each label
    classCount = {}
    for i in range(k):
        # sortedDistIndicies[0] is the index of the sample with the smallest distance,
        # so labels[sortedDistIndicies[i]] is the label of the i-th nearest sample
        voteIlabel = labels[sortedDistIndicies[i]]
        classCount[voteIlabel] = classCount.get(voteIlabel, 0) + 1
    # Sort the labels by vote count in descending order;
    # sortedClassCount[0][0] is the label with the most votes among the k neighbors
    sortedClassCount = sorted(classCount.items(), key=operator.itemgetter(1), reverse=True)
    print(sortedClassCount[0][0])
    return sortedClassCount[0][0]

# Create a small sample data set
def createDataSet():
    group = array([[1.0, 1.1], [1.0, 1.0], [0, 0], [0, 0.1]])
    labels = ['A', 'A', 'B', 'B']
    return group, labels

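# Plot the points
def draw(xs, ys):
    fig = plot.figure()
    # add_subplot(221) splits the canvas into a 2x2 grid and draws in the first cell
    # (counting left to right, top to bottom); this sets the size and position of the plot
    ax = fig.add_subplot(221)
    # The two arguments of ax.scatter(xs, ys) are the x coordinates and the y coordinates of all points
    ax.scatter(xs, ys)
    plot.show()
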
def firstTest():
    test1 = (1.0, 1.2)
    test2 = (0.0, 0.4)
    dataset, labels = createDataSet()
    conclusion1 = classify0(test1, dataset, labels, 3)
    conclusion2 = classify0(test2, dataset, labels, 3)
    print(str(test1) + " is classified as class " + conclusion1)
    print(str(test2) + " is classified as class " + conclusion2)

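# Read a 32*32 matrix of digits from a text file into a 1*1024 vector
def img2vector(filename):
    returnVect = zeros((1, 1024))
    # Use a context manager so the file is closed after reading
    with open(filename) as fr:
        for i in range(32):
            lineStr = fr.readline()
            for j in range(32):
                returnVect[0, 32 * i + j] = int(lineStr[j])
    return returnVect
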
def handwritingClassTest():
    hwLabels = []
    # List the training sample files
    trainingFileList = listdir('digits/trainingDigits')
    # Number of training samples
    m = len(trainingFileList)
    # Training matrix with m rows and 1024 columns
    trainingMat = zeros((m, 1024))
    # The number to the left of the underscore in each file name is the label
    for i in range(m):
        fileNameStr = trainingFileList[i]
        fileStr = fileNameStr.split(".")[0]
        # Class label
        classNumStr = int(fileStr.split('_')[0])
        hwLabels.append(classNumStr)
        trainingMat[i, :] = img2vector('digits/trainingDigits/%s' % fileNameStr)
    testFileList = listdir('digits/testDigits')
    errorCount = 0.0
    mTest = len(testFileList)
    for i in range(mTest):
        fileNameStr = testFileList[i]
        fileStr = fileNameStr.split('.')[0]  # take off .txt
        classNumStr = int(fileStr.split('_')[0])
        vectorUnderTest = img2vector('digits/testDigits/%s' % fileNameStr)
        classifierResult = classify0(vectorUnderTest, trainingMat, hwLabels, 3)
        print("the classifier came back with: %d, the real answer is: %d" % (classifierResult, classNumStr))
        if classifierResult != classNumStr:
            errorCount += 1.0
    print("\nthe total number of errors is: %d" % errorCount)
    print("\nthe total error rate is: %f" % (errorCount / float(mTest)))

# Entry point: call the functions defined in this module
if __name__ == "__main__":
    # group, label = createDataSet()
    # # group[:, 0] is column 0 of every row
    # draw(group[:, 0], group[:, 1])
    # # print(group[:, 0])
    # firstTest()
    handwritingClassTest()
Data for the training set and the test set: https://gitee.com/lcl1993213/plist
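For readers who want to check the file-handling steps without downloading the data, the sketch below writes a tiny stand-in file and flattens it. It assumes, as the code above does, that each sample is a text file of 32 lines with 32 digit characters per line, and that the file name has the form `<label>_<index>.txt`. The helper names `flatten_digit_file` and `label_from_filename` and the file name `7_0.txt` are hypothetical.

```python
import os
import tempfile


def flatten_digit_file(path, size=32):
    """Read a size x size grid of digit characters into a flat list of ints."""
    vector = []
    with open(path) as fr:
        for _ in range(size):
            line = fr.readline()
            # Keep only the first `size` characters, which drops the newline.
            vector.extend(int(ch) for ch in line[:size])
    return vector


def label_from_filename(filename):
    """Extract the class label from a name such as '7_0.txt' (the part before '_')."""
    stem = filename.split(".")[0]
    return int(stem.split("_")[0])


# Build a stand-in sample: a 32x32 grid of zeros with a single 1 at row 2, column 5.
grid = [["0"] * 32 for _ in range(32)]
grid[2][5] = "1"

with tempfile.TemporaryDirectory() as tmp:
    filename = "7_0.txt"
    path = os.path.join(tmp, filename)
    with open(path, "w") as fw:
        fw.write("\n".join("".join(row) for row in grid) + "\n")
    vector = flatten_digit_file(path)
    label = label_from_filename(filename)

# Row 2, column 5 lands at flat index 32 * 2 + 5 = 69.
print(len(vector), vector.index(1), label)
```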