{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Discovering association rules in software project tags\n", "\n", "* *60 min* | Last modified: November 26, 2020" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This tutorial is based on *Mastering Data Mining with Python, Megan Squire, 2016. Packt Publishing*." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The file `project_tags.csv` contains the tags that developers assigned to different software projects. The first column is the project ID; the second is the assigned tag. The goal is to build rules that suggest a tag based on two tags the user has already selected." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import sqlite3\n", "\n", "conn = sqlite3.connect(\":memory:\")\n", "cursor = conn.cursor()" ] }, { "cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "--2020-11-29 04:47:25-- https://raw.githubusercontent.com/jdvelasq/datalabs/master/datasets/project_tags.csv\n", "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 199.232.48.133\n", "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|199.232.48.133|:443... connected.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 5418355 (5.2M) [text/plain]\n", "Saving to: ‘project_tags.csv.1’\n", "\n", "project_tags.csv.1 100%[===================>] 5.17M 4.52MB/s in 1.1s \n", "\n", "2020-11-29 04:47:27 (4.52 MB/s) - ‘project_tags.csv.1’ saved [5418355/5418355]\n", "\n" ] } ], "source": [ "!wget https://raw.githubusercontent.com/jdvelasq/datalabs/master/datasets/project_tags.csv" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading the data" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "conn.executescript(\"\"\"\n", "DROP TABLE IF EXISTS project_tags;\n", "\n", "CREATE TABLE project_tags\n", "(\n", "    project_id INT NOT NULL DEFAULT '0',\n", "    tag_name STRING NOT NULL DEFAULT '0',\n", "    PRIMARY KEY (project_id, tag_name)\n", ");\n", "\"\"\")\n", "\n", "conn.commit()" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[('36762', 'Database Engines/Servers'),\n", " ('14882', 'Systems Administration'),\n", " ('53184', 'C'),\n", " ('41895', 'multimedia'),\n", " ('53266', 'Desktop Environment')]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "with open('project_tags.csv', 'rt') as f:\n", "    data = f.readlines()\n", "\n", "## Strip the trailing '\\n' from each line\n", "data = [line.replace('\\n', '') for line in data]\n", "\n", "## Split the fields on commas\n", "data = [line.split(',') for line in data]\n", "\n", "## Convert each row to a tuple\n", "data = [tuple(line) for line in data]\n", "\n", "## Remove duplicate rows\n", "data = list(set(data))\n", "\n", "\n", "## Print the first 5 records as a check\n", "data[0:5]" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[(36762, 'Database Engines/Servers'),\n", " (14882, 'Systems Administration'),\n", " (53184, 'C'),\n", " (41895, 'multimedia'),\n", " 
(53266, 'Desktop Environment')]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "##\n", "## Load the table from the list of tuples\n", "## held in data\n", "##\n", "cursor.executemany('INSERT INTO project_tags VALUES (?,?)', data)\n", "\n", "##\n", "## Check\n", "##\n", "cursor.execute(\"SELECT * FROM project_tags LIMIT 5;\").fetchall()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Basic information" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[(353401,)]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "##\n", "## Number of records\n", "##\n", "cursor.execute(\"SELECT COUNT(*) FROM project_tags;\").fetchone()[0]" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(46511,)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "##\n", "## Number of projects\n", "##\n", "cursor.execute(\"SELECT COUNT(DISTINCT project_id) FROM project_tags;\").fetchone()[0]" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "46511" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "##\n", "## Number of projects\n", "## The number of baskets is the number of distinct projects in the table\n", "##\n", "baskets = cursor.execute(\"SELECT COUNT(DISTINCT project_id) FROM project_tags;\").fetchone()[0]\n", "baskets" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Number of projects per tag and support" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "GPL 21176 45.53%\n", "POSIX 16868 36.27%\n", "Linux 16284 35.01%\n", "C 10288 22.12%\n", "OS Independent 10178 21.88%\n", "Software Development 9614 20.67%\n", "Internet 
8097 17.41%\n", "Windows 7572 16.28%\n", "Java 6390 13.74%\n", "Web 6264 13.47%\n", "English 5997 12.89%\n", "C++ 5891 12.67%\n", "Libraries 5738 12.34%\n", "PHP 5448 11.71%\n", "Unix 5098 10.96%\n", "Mac OS X 4823 10.37%\n", "multimedia 4813 10.35%\n", "Communications 4449 9.57%\n", "Perl 4242 9.12%\n", "Python 4190 9.01%\n", "LGPL 3524 7.58%\n", "Utilities 3297 7.09%\n", "Dynamic Content 3199 6.88%\n", "GPLv3 2875 6.18%\n", "Networking 2819 6.06%\n", "Scientific/Engineering 2678 5.76%\n", "Games/Entertainment 2528 5.44%\n", "BSD 2494 5.36%\n", "Desktop Environment 2335 5.02%\n", "Graphics 2268 4.88%\n", "Database 2200 4.73%\n", "GPLv2 2147 4.62%\n", "Text Processing 2131 4.58%\n", "Sound/Audio 2094 4.50%\n", "Security 1960 4.21%\n" ] } ], "source": [ "##\n", "## Number of projects per tag\n", "##\n", "x = cursor.execute(\n", "\"\"\"\n", "    SELECT\n", "        tag_name,\n", "        COUNT(project_id),\n", "        ROUND(\n", "            COUNT(project_id) * 100.0 / (SELECT COUNT(DISTINCT project_id) FROM project_tags),\n", "            2)\n", "    FROM\n", "        project_tags\n", "    GROUP BY\n", "        1\n", "    ORDER BY\n", "        2 DESC\n", "    LIMIT\n", "        35;\n", "\"\"\")\n", "\n", "##\n", "## 5% of the 46511 projects is about 2326 projects\n", "##\n", "for tag, value, pct in x.fetchall():\n", "    print(\"{:23s} {:6d} {:6.2f}%\".format(tag, value, pct))" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Minimum support count: 2325.55 (5% of baskets)\n" ] } ], "source": [ "#\n", "# Minimum support, as a percentage of baskets\n", "#\n", "MIN_SUPPORT_PCT = 5\n", "\n", "#\n", "# Tags that appear in fewer than MIN_SUPPORT_PCT percent of the projects are discarded.\n", "# With 46511 projects, a tag must appear in at least 2326 projects (5% = 2325.55).\n", "#\n", "minsupport = baskets * (MIN_SUPPORT_PCT / 100)\n", "print(\n", "    \"Minimum support count: {} ({}% of baskets)\".format(minsupport, MIN_SUPPORT_PCT),\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ 
"## Singletons" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['Apache 2.0', 'Application Frameworks', 'Archiving', 'Artistic', 'BSD']" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "##\n", "## Descarta los tags menos frecuentes. Singletons es una \n", "## lista de tuplas de la siguiente forma:\n", "##\n", "## [('Apache 2.0',),\n", "## ('Application Frameworks',),\n", "## ('Archiving',),\n", "## ...\n", "## ]\n", "##\n", "singletons = cursor.execute(\n", " \"\"\"\n", " SELECT \n", " DISTINCT tag_name\n", " FROM \n", " project_tags\n", " GROUP BY \n", " 1 \n", " HAVING \n", " COUNT(project_id) >= {} \n", " ORDER BY \n", " tag_name\n", " \"\"\".format(\n", " minsupport\n", " )\n", ").fetchall()\n", "\n", "##\n", "## Esta variable contiene todos los tags que aparecen\n", "## en, al menos, el 5% de los proyectos\n", "##\n", "allSingletonTags = [x[0] for x in singletons]\n", " \n", "allSingletonTags[:5]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Doubletons" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [], "source": [ "##\n", "## La siguiente tabla contiene la cantidad de proyectos\n", "## que tienen tag1 y tag2 simultáneamente\n", "##\n", "conn.executescript(\n", " \"\"\"\n", " DROP TABLE IF EXISTS project_tag_pairs;\n", "\n", " CREATE TABLE project_tag_pairs \n", " (\n", " tag1 STRING,\n", " tag2 STRING,\n", " num_projs INT\n", " );\n", " \"\"\"\n", ")\n", "\n", "conn.commit()" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(0, 1)\n", "(0, 2)\n", "(0, 3)\n", "(1, 2)\n", "(1, 3)\n", "(2, 3)\n" ] } ], "source": [ "from itertools import combinations\n", "\n", "##\n", "## Uso de itertools.combinations\n", "##\n", "x = [0, 1, 2, 3]\n", "for w in list(combinations(x, 2)):\n", " print(w)" ] }, { "cell_type": "code", 
"execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "...................................." ] } ], "source": [ "##\n", "## Tags que aparecen unicamente en las\n", "## combinaciones admisibles de dos tags\n", "## diferentes\n", "##\n", "allDoubletonTags = set()\n", "\n", "##\n", "## Tuplas unicas formadas por (tag0, tag1)\n", "##\n", "doubletonSet = set()\n", "\n", "\n", "def findDoubletons():\n", "\n", " ##\n", " ## INNER JOIN retorna lo registros que aparecen\n", " ## simultaneamente en las dos tablas (intersección)\n", " ##\n", " ## La siguiente consulta retorna cuantos proyectos usan\n", " ## tag1 y tag2 simultaneamente.\n", " ##\n", " ## Si:\n", " ##\n", " ## prj0, tag0\n", " ## prj0, tag1\n", " ## prj0, tag2\n", " ## prj1, tag0\n", " ## prj1, tag1\n", " ## prj1, tag3\n", " ## prj2, tag0\n", " ## prj2, tag3\n", " ## ...\n", " ##\n", " ## El inner join con tag0 y tag1 genera:\n", " ##\n", " ## prj0, tag0, prj0, tag1\n", " ## prj1, tag0, prj1, tag1\n", " ## ...\n", " ##\n", " getDoubletonFrequencyQuery = \"\"\"\n", " SELECT \n", " count(t1.project_id) \n", " FROM \n", " project_tags t1\n", " INNER JOIN \n", " project_tags t2\n", " ON \n", " t1.project_id = t2.project_id\n", " WHERE \n", " (\n", " t1.tag_name = '{}'\n", " AND t2.tag_name = '{}'\n", " )\n", " \"\"\"\n", "\n", " insertPairQuery = \"\"\"\n", " INSERT INTO \n", " project_tag_pairs (tag1, tag2, num_projs)\n", " VALUES \n", " ('{}','{}',{})\n", " \"\"\"\n", "\n", " ##\n", " ## Genera todas las combinaciones de dos tags usando\n", " ## los tags individuales que cumplen con una ocurrencia\n", " ## minima\n", " ##\n", " doubletonCandidates = list(combinations(allSingletonTags, 2))\n", "\n", " for (index, candidate) in enumerate(doubletonCandidates):\n", "\n", " tag1 = candidate[0]\n", " tag2 = candidate[1]\n", "\n", " ##\n", " ## Cuenta la cantidad de proyectos que usan tag1 y tag2 simultaneamente\n", " ##\n", " count = cursor.execute(\n", " 
getDoubletonFrequencyQuery.format(tag1, tag2)\n", "        ).fetchone()[0]\n", "\n", "        if count >= minsupport:\n", "\n", "            ## Don't panic!: progress indicator while the loop runs.\n", "            print(\".\", sep=\"\", end=\"\")\n", "\n", "            ##\n", "            ## Store the pair and its project count in project_tag_pairs\n", "            ##\n", "            cursor.execute(insertPairQuery.format(tag1, tag2, count))\n", "\n", "            ##\n", "            ## Record the tuple (tag1, tag2) in the set of frequent pairs\n", "            ##\n", "            doubletonSet.add(candidate)\n", "\n", "            ##\n", "            ## Add both tags to the set of tags used in frequent pairs\n", "            ##\n", "            allDoubletonTags.add(tag1)\n", "            allDoubletonTags.add(tag2)\n", "\n", "\n", "findDoubletons()" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "C GPL 5539\n", "C Linux 5648\n", "C POSIX 6952\n", "C++ GPL 2911\n", "C++ Linux 3425\n", "C++ POSIX 3501\n", "Communications GPL 2578\n", "Dynamic Content Internet 3171\n", "Dynamic Content Web 3170\n", "English Linux 2660\n", "GPL Internet 4035\n", "GPL Linux 8036\n", "GPL OS Independent 4403\n", "GPL PHP 2372\n", "GPL POSIX 10062\n", "GPL Software Development 3318\n", "GPL Web 2899\n", "GPL Windows 2603\n", "GPL multimedia 2879\n", "Internet OS Independent 3005\n", "Internet POSIX 2831\n", "Internet Web 5973\n", "Java OS Independent 3433\n", "Java Software Development 2356\n", "Libraries Software Development 5633\n", "Linux Mac OS X 2973\n", "Linux POSIX 11896\n", "Linux Software Development 2335\n", "Linux Unix 2493\n", "Linux Windows 5279\n", "Mac OS X Windows 3131\n", "OS Independent Software Development 3564\n", "OS Independent Web 2602\n", "POSIX Software Development 3501\n", "POSIX Windows 4464\n", "POSIX multimedia 2538\n" ] } ], "source": [ "x = cursor.execute(\n", "\"\"\"\n", "    SELECT\n", "        *\n", "    FROM\n", "        project_tag_pairs\n", "    ORDER BY\n", "        1 ASC,\n", "        2 ASC;\n", "\"\"\")\n", "\n", "##\n", "## 5% of the 46511 projects is about 2326 projects\n", "##\n", "for tag1, tag2, num_projs in x.fetchall():\n", "    print(\"{:23s} {:23s} {:6d}\".format(tag1, tag2, 
num_projs))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tripletons" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": [ "##\n", "## The following table holds the number of projects\n", "## that carry tag1, tag2 and tag3 simultaneously\n", "##\n", "conn.executescript(\"\"\"\n", "DROP TABLE IF EXISTS project_tag_triples;\n", "\n", "CREATE TABLE project_tag_triples\n", "(\n", "    tag1 STRING,\n", "    tag2 STRING,\n", "    tag3 STRING,\n", "    num_projs INT\n", ");\n", "\"\"\")\n", "\n", "conn.commit()" ] }, { "cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ ".........*\n" ] } ], "source": [ "def findTripletons():\n", "\n", "    ##\n", "    ## Same logic as the doubleton query, with one more self-join\n", "    ##\n", "    getTripletonFrequencyQuery = \"\"\"\n", "        SELECT\n", "            count(t1.project_id)\n", "        FROM\n", "            project_tags t1\n", "        INNER JOIN\n", "            project_tags t2\n", "        ON\n", "            t1.project_id = t2.project_id\n", "        INNER JOIN\n", "            project_tags t3\n", "        ON\n", "            t2.project_id = t3.project_id\n", "        WHERE\n", "            (\n", "                t1.tag_name = '{}'\n", "                AND t2.tag_name = '{}'\n", "                AND t3.tag_name = '{}'\n", "            )\n", "    \"\"\"\n", "\n", "    insertTripletonQuery = \"\"\"\n", "        INSERT INTO project_tag_triples(tag1, tag2, tag3, num_projs)\n", "        VALUES ('{}','{}','{}',{})\n", "    \"\"\"\n", "\n", "    ##\n", "    ## Build sorted triples from the tags that appear in at least\n", "    ## one pair meeting the minimum support\n", "    ##\n", "    tripletonCandidates = [\n", "        sorted(tc) for tc in list(combinations(allDoubletonTags, 3))\n", "    ]\n", "\n", "    for index, candidate in enumerate(tripletonCandidates):\n", "\n", "        ##\n", "        ## Keep the triple only if at least one of its pairs is in\n", "        ## doubletonSet. (A frequent triple always has all three of\n", "        ## its pairs frequent, so this filter is looser than it could be.)\n", "        ##\n", "        if any(\n", "            [\n", "                tuple_ in doubletonSet\n", "                for tuple_ in list(combinations(candidate, 2))\n", "            ]\n", "        ):\n", "\n", "            ##\n", "            ## Compute 
the frequency of the triple\n", "            ##\n", "            count = cursor.execute(\n", "                getTripletonFrequencyQuery.format(candidate[0], candidate[1], candidate[2])\n", "            ).fetchone()[0]\n", "\n", "            ##\n", "            ## Insert the triples that meet the minimum support\n", "            ##\n", "            if count >= minsupport:\n", "\n", "                print(\".\", sep=\"\", end=\"\")\n", "\n", "                cursor.execute(\n", "                    insertTripletonQuery.format(\n", "                        candidate[0], candidate[1], candidate[2], count\n", "                    ),\n", "                )\n", "    print('*')\n", "\n", "\n", "findTripletons()" ] }, { "cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "C GPL Linux 3295\n", "C GPL POSIX 4360\n", "C Linux POSIX 4625\n", "C++ Linux POSIX 2621\n", "Dynamic Content Internet Web 3163\n", "GPL Internet Web 2874\n", "GPL Linux POSIX 7379\n", "Internet OS Independent Web 2516\n", "Linux POSIX Windows 3312\n" ] } ], "source": [ "x = cursor.execute(\n", "\"\"\"\n", "    SELECT\n", "        *\n", "    FROM\n", "        project_tag_triples\n", "    ORDER BY\n", "        1 ASC,\n", "        2 ASC,\n", "        3 ASC;\n", "\"\"\")\n", "\n", "\n", "for tag1, tag2, tag3, num_projs in x.fetchall():\n", "    print(\"{:23s} {:23s} {:23s} {:6d}\".format(tag1, tag2, tag3, num_projs))" ] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [], "source": [ "def calcSCAV(tagA, tagB, tagC, ruleSupport, file):\n", "    ##\n", "    ## Support: fraction of baskets that contain tagA, tagB and tagC\n", "    ##\n", "    ruleSupportPct = round((ruleSupport / baskets), 2)\n", "\n", "    ##\n", "    ## Confidence: support(tagA, tagB, tagC) / support(tagA, tagB)\n", "    ##\n", "    queryConf = \"\"\"\n", "        SELECT num_projs\n", "        FROM project_tag_pairs\n", "        WHERE\n", "        (\n", "            (tag1 = '{}' AND tag2 = '{}')\n", "            OR (tag2 = '{}' AND tag1 = '{}')\n", "        )\n", "    \"\"\"\n", "\n", "    pairSupport = cursor.execute(queryConf.format(tagA, tagB, tagA, tagB)).fetchone()[0]\n", "\n", "    confidence = round((ruleSupport / pairSupport), 2)\n", "\n", "    ##\n", "    ## Added value: confidence minus the support of tagC on its own\n", "    ##\n", "    queryAV = \"\"\"\n", "        SELECT count(*)\n", "        FROM project_tags\n", 
" WHERE tag_name= '{}'\n", " \"\"\"\n", " \n", " supportTagC = cursor.execute(queryAV.format(tagC)).fetchone()[0]\n", " \n", " supportTagCPct = supportTagC / baskets\n", " \n", " addedValue = round((confidence - supportTagCPct), 2)\n", "\n", " print(\n", " \"{}, {} -> {} [S={}, C={}, AV={}]\".format(\n", " tagA, tagB, tagC, ruleSupportPct, confidence, addedValue\n", " ),\n", " file=file,\n", " )\n", "\n", "\n", "def generateRules():\n", "\n", " ##\n", " ## Consulta para obtiener las tripletas para obtener las reglas\n", " ##\n", " getFinalListQuery = \"\"\"\n", " SELECT tag1, tag2, tag3, num_projs FROM project_tag_triples\n", " \"\"\"\n", "\n", " ##\n", " ## Obtiene las tripletas\n", " ##\n", " triples = cursor.execute(getFinalListQuery).fetchall()\n", "\n", " with open(\"report.txt\", \"w\") as file:\n", "\n", " for triple in triples:\n", "\n", " tag1 = triple[0]\n", " tag2 = triple[1]\n", " tag3 = triple[2]\n", " ruleSupport = triple[3]\n", "\n", " calcSCAV(tag1, tag2, tag3, ruleSupport, file)\n", " calcSCAV(tag1, tag3, tag2, ruleSupport, file)\n", " calcSCAV(tag2, tag3, tag1, ruleSupport, file)\n", " print(\"*\", file=file)\n", "\n", "\n", "generateRules()" ] }, { "cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Dynamic Content, Internet -> Web [S=0.07, C=1.0, AV=0.87]\n", "Dynamic Content, Web -> Internet [S=0.07, C=1.0, AV=0.83]\n", "Internet, Web -> Dynamic Content [S=0.07, C=0.53, AV=0.46]\n", "*\n", "Internet, OS Independent -> Web [S=0.05, C=0.84, AV=0.71]\n", "Internet, Web -> OS Independent [S=0.05, C=0.42, AV=0.2]\n", "OS Independent, Web -> Internet [S=0.05, C=0.97, AV=0.8]\n", "*\n", "GPL, Internet -> Web [S=0.06, C=0.71, AV=0.58]\n", "GPL, Web -> Internet [S=0.06, C=0.99, AV=0.82]\n", "Internet, Web -> GPL [S=0.06, C=0.48, AV=0.02]\n", "*\n", "C, Linux -> POSIX [S=0.1, C=0.82, AV=0.46]\n", "C, POSIX -> Linux [S=0.1, C=0.67, AV=0.32]\n", "Linux, POSIX -> C [S=0.1, C=0.39, 
AV=0.17]\n", "*\n", "C, GPL -> POSIX [S=0.09, C=0.79, AV=0.43]\n", "C, POSIX -> GPL [S=0.09, C=0.63, AV=0.17]\n", "GPL, POSIX -> C [S=0.09, C=0.43, AV=0.21]\n", "*\n", "C, GPL -> Linux [S=0.07, C=0.59, AV=0.24]\n", "C, Linux -> GPL [S=0.07, C=0.58, AV=0.12]\n", "GPL, Linux -> C [S=0.07, C=0.41, AV=0.19]\n", "*\n", "Linux, POSIX -> Windows [S=0.07, C=0.28, AV=0.12]\n", "Linux, Windows -> POSIX [S=0.07, C=0.63, AV=0.27]\n", "POSIX, Windows -> Linux [S=0.07, C=0.74, AV=0.39]\n", "*\n", "C++, Linux -> POSIX [S=0.06, C=0.77, AV=0.41]\n", "C++, POSIX -> Linux [S=0.06, C=0.75, AV=0.4]\n", "Linux, POSIX -> C++ [S=0.06, C=0.22, AV=0.09]\n", "*\n", "GPL, Linux -> POSIX [S=0.16, C=0.92, AV=0.56]\n", "GPL, POSIX -> Linux [S=0.16, C=0.73, AV=0.38]\n", "Linux, POSIX -> GPL [S=0.16, C=0.62, AV=0.16]\n", "*\n" ] } ], "source": [ "!head -n 48 report.txt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9" } }, "nbformat": 4, "nbformat_minor": 4 }