Accessing HDFS with snakebite from Python

  • Last modified: May 13, 2022

Listing files in HDFS

[1]:
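from snakebite.client import Client

# Connect to the NameNode running on localhost, port 9000
client = Client('localhost', 9000)
# ls() takes a list of paths and yields one dict per entry
for x in client.ls(['/']):
    print(x)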
{'file_type': 'd', 'permission': 511, 'path': '/tmp', 'length': 0, 'owner': 'root', 'group': 'supergroup', 'block_replication': 0, 'modification_time': 1652728033055, 'access_time': 0, 'blocksize': 0}
{'file_type': 'd', 'permission': 493, 'path': '/user', 'length': 0, 'owner': 'root', 'group': 'supergroup', 'block_replication': 0, 'modification_time': 1652728036942, 'access_time': 0, 'blocksize': 0}
[2]:
# Print only the path of each entry
for x in client.ls(['/']):
    print(x['path'])
/tmp
/user
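
The `permission` and `modification_time` fields in the first listing are raw numbers. Below is a minimal sketch of one way to make them readable. It assumes `permission` is the file mode stored as a plain integer (511 is 777 in octal) and `modification_time` is milliseconds since the Unix epoch, as the values above suggest. `describe_entry` is a hypothetical helper, not part of snakebite.

```python
from datetime import datetime, timezone


def describe_entry(entry):
    """Return a one-line, human-readable summary of an ls() entry.

    Assumes 'permission' is the mode as a plain integer and
    'modification_time' is milliseconds since the Unix epoch.
    """
    mode = format(entry['permission'], 'o')  # 511 -> '777'
    modified = datetime.fromtimestamp(
        entry['modification_time'] / 1000, tz=timezone.utc
    )
    return '{} {} {} {}'.format(
        entry['file_type'],
        mode,
        modified.strftime('%Y-%m-%d %H:%M:%S'),
        entry['path'],
    )


# Entry copied from the listing above, trimmed to the fields used here.
sample = {'file_type': 'd', 'permission': 511, 'path': '/tmp',
          'modification_time': 1652728033055}
print(describe_entry(sample))
```

With a live connection, the same helper could be applied to each dict yielded by `client.ls(['/'])`.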

Available commands

cat [paths]                    copy source paths to stdout
chgrp <grp> [paths]            change group
chmod <mode> [paths]           change file mode (octal)
chown <owner:grp> [paths]      change owner
copyToLocal [paths] dst        copy paths to local file system destination
count [paths]                  display stats for paths
df                             display fs stats
du [paths]                     display disk usage statistics
get file dst                   copy files to local file system destination
getmerge dir dst               concatenates files in source dir into destination local file
ls [paths]                     list a path
mkdir [paths]                  create directories
mkdirp [paths]                 create directories and their parents
mv [paths] dst                 move paths to destination
rm [paths]                     remove paths
rmdir [dirs]                   delete a directory
serverdefaults                 show server information
setrep <rep> [paths]           set replication factor
stat [paths]                   stat information
tail path                      display last kilobyte of the file to stdout
test path                      test a path
text path [paths]              output file in text format
touchz [paths]                 creates a file of zero length
usage <cmd>                    show cmd usage

Creating a directory in /tmp/

[3]:
from snakebite.client import Client

client = Client('localhost', 9000)
# mkdir() takes a list of paths and yields one result dict per path
for x in client.mkdir(['/tmp/demo']):
    print(x)
{'path': '/tmp/demo', 'result': True}
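
The `mkdirp` entry in the command list corresponds to passing `create_parent=True` to `mkdir`, and `test` can check whether a path already exists. The sketch below combines the two into a hypothetical helper, `ensure_dir` (not part of snakebite). It assumes `client` behaves like `snakebite.client.Client` and that each result dict carries a `result` key, as in the output above.

```python
def ensure_dir(client, path):
    """Create `path` (and any missing parents) unless it already exists.

    `client` is expected to behave like snakebite.client.Client.
    Returns True if the directory was created, False if it was already there.
    """
    if client.test(path, exists=True, directory=True):
        return False
    # mkdir() returns a lazy generator; list() forces the request to run.
    results = list(client.mkdir([path], create_parent=True))
    return all(r['result'] for r in results)
```

A call such as `ensure_dir(client, '/tmp/demo')` could then be repeated safely. If the path exists but is a regular file, the `mkdir` call is still attempted and will most likely fail.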

Creating an empty file in HDFS

[4]:
# touchz() creates a zero-length file for each path in the list
for p in client.touchz(['/tmp/demo/text.txt']):
    print(p)
{'path': '/tmp/demo/text.txt', 'result': True}

Displaying the contents of a file in HDFS

[5]:
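# cat() yields one generator per path; each generator yields the file's data.
# The file is empty, so nothing is printed.
for p in client.cat(['/tmp/demo/text.txt']):
    for chunk in p:
        print(chunk)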
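
The cell above prints nothing because the file created with `touchz` has zero length. For files that do have content, the nested generators returned by `cat` can be flattened into a single value. The following is a minimal sketch with a hypothetical helper, `read_file` (not part of snakebite). It assumes `client` behaves like `snakebite.client.Client` and that the pieces yielded are `str` or `bytes`, depending on the Python and snakebite versions in use.

```python
def read_file(client, path):
    """Return the full contents of one HDFS file as bytes.

    `client` is expected to behave like snakebite.client.Client, whose
    cat() yields one generator of data chunks per requested path.
    """
    chunks = []
    for file_chunks in client.cat([path]):
        for chunk in file_chunks:
            # A chunk may be str or bytes depending on the version in use;
            # normalise to bytes (assumption: text is UTF-8).
            if isinstance(chunk, str):
                chunk = chunk.encode('utf-8')
            chunks.append(chunk)
    return b''.join(chunks)
```

For the empty file in this example, `read_file(client, '/tmp/demo/text.txt')` would be expected to return `b''`.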

Deleting

[6]:
# recurse=True removes the directory together with its contents
for p in client.delete(['/tmp/demo'], recurse=True):
    print(p)
{'path': '/tmp/demo', 'result': True}
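
Every cell above iterates over the value returned by the client method. This matters because those methods return generators, and the operation only runs once the generator is consumed. The sketch below shows a hypothetical helper, `failed_paths` (not part of snakebite), that consumes such a generator and reports which paths did not succeed. It assumes each result is a dict with `path` and `result` keys, as in the outputs shown here.

```python
def failed_paths(results):
    """Consume a snakebite result generator and return the paths that failed.

    Iterating over `results` is what actually triggers the operation.
    """
    return [r['path'] for r in results if not r['result']]
```

For example, `failed_paths(client.delete(['/tmp/demo'], recurse=True))` would run the deletion and return an empty list if every path reported `True`. Errors such as a missing path may be raised as exceptions instead of being reported in `result`.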