Accessing HDFS with snakebite from Python
Last modified: May 13, 2022
Listing files in HDFS
[1]:
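from snakebite.client import Client

# Connect to the NameNode (host and RPC port)
client = Client('localhost', 9000)

# ls() takes a list of paths and yields one dict per entry
for x in client.ls(['/']):
    print(x)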
{'file_type': 'd', 'permission': 511, 'path': '/tmp', 'length': 0, 'owner': 'root', 'group': 'supergroup', 'block_replication': 0, 'modification_time': 1652728033055, 'access_time': 0, 'blocksize': 0}
{'file_type': 'd', 'permission': 493, 'path': '/user', 'length': 0, 'owner': 'root', 'group': 'supergroup', 'block_replication': 0, 'modification_time': 1652728036942, 'access_time': 0, 'blocksize': 0}
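The dictionaries printed above are easier to read once a few fields are decoded. The sketch below is illustrative and not part of snakebite: `describe_entry` is a hypothetical helper that assumes `permission` is the numeric file mode (511 is octal 777) and that `modification_time` is in milliseconds since the Unix epoch, as is usual for HDFS. It works on a copy of the first output line, so it runs without a cluster.

```python
from datetime import datetime, timezone


def describe_entry(entry):
    """Return a one-line, human-readable summary of a dict yielded by ls()."""
    # 'd' marks a directory in the output above; anything else is treated as a file here
    kind = 'dir' if entry['file_type'] == 'd' else 'file'
    # The mode is stored as a plain integer; show it in octal, as chmod expects
    mode = format(entry['permission'], 'o')
    # Assumption: timestamps are milliseconds since the epoch
    modified = datetime.fromtimestamp(entry['modification_time'] / 1000, tz=timezone.utc)
    return '{} {} {}:{} {:%Y-%m-%d %H:%M:%S} {}'.format(
        kind, mode, entry['owner'], entry['group'], modified, entry['path'])


# Copied from the first line of output above
sample = {'file_type': 'd', 'permission': 511, 'path': '/tmp', 'length': 0,
          'owner': 'root', 'group': 'supergroup', 'block_replication': 0,
          'modification_time': 1652728033055, 'access_time': 0, 'blocksize': 0}

print(describe_entry(sample))
# dir 777 root:supergroup 2022-05-16 19:07:13 /tmp
```

With a live connection, the same helper could be applied to each item of `client.ls(['/'])`.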
[2]:
# Print only the path of each entry
for x in client.ls(['/']):
    print(x['path'])
/tmp
/user
Available commands
cat [paths]                copy source paths to stdout
chgrp <grp> [paths]        change group
chmod <mode> [paths]       change file mode (octal)
chown <owner:grp> [paths]  change owner
copyToLocal [paths] dst    copy paths to local file system destination
count [paths]              display stats for paths
df                         display fs stats
du [paths]                 display disk usage statistics
get file dst               copy files to local file system destination
getmerge dir dst           concatenates files in source dir into destination local file
ls [paths]                 list a path
mkdir [paths]              create directories
mkdirp [paths]             create directories and their parents
mv [paths] dst             move paths to destination
rm [paths]                 remove paths
rmdir [dirs]               delete a directory
serverdefaults             show server information
setrep <rep> [paths]       set replication factor
stat [paths]               stat information
tail path                  display last kilobyte of the file to stdout
test path                  test a path
text path [paths]          output file in text format
touchz [paths]             creates a file of zero length
usage <cmd>                show cmd usage
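The list above reads like the help output of the snakebite command-line tool, and not every command name matches a method on the Python `Client`. The lookup table below is a small sketch covering only the commands used in this notebook plus three whose Python names differ, to the best of my knowledge: `rm` corresponds to `delete`, `mv` to `rename`, and `mkdirp` to `mkdir` with `create_parent=True`. The table and the `client_call_for` helper are hypothetical names introduced here for illustration.

```python
# CLI command -> (Client method name, extra keyword arguments)
# Deliberately partial: only entries believed to be correct are listed.
CLI_TO_CLIENT = {
    'ls': ('ls', {}),
    'cat': ('cat', {}),
    'mkdir': ('mkdir', {}),
    'touchz': ('touchz', {}),
    'rm': ('delete', {}),
    'mv': ('rename', {}),
    'mkdirp': ('mkdir', {'create_parent': True}),
}


def client_call_for(command):
    """Return (method_name, kwargs) for a CLI command, or None if not listed."""
    return CLI_TO_CLIENT.get(command)


print(client_call_for('rm'))      # ('delete', {})
print(client_call_for('mkdirp'))  # ('mkdir', {'create_parent': True})
```

For any command missing from the table, the snakebite client documentation is the place to check the method name and its arguments.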
Creating a directory in /tmp/
[3]:
from snakebite.client import Client

client = Client('localhost', 9000)

# mkdir() yields one result dict per requested path
for x in client.mkdir(['/tmp/demo']):
    print(x)
{'path': '/tmp/demo', 'result': True}
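The `for` loop in the cell above is doing real work: snakebite methods such as `mkdir` return generators, so the request is only carried out when the generator is iterated. The sketch below shows that behaviour with a stand-in object. `FakeClient` and `ensure_dir` are hypothetical names, not part of snakebite; the fake imitates only the generator behaviour and the `create_parent` keyword, so the block runs without a cluster.

```python
class FakeClient:
    """Stand-in for snakebite's Client that records which paths were created."""

    def __init__(self):
        self.created = []

    def mkdir(self, paths, create_parent=False, mode=0o755):
        # A generator function: nothing in this body runs until it is iterated
        for path in paths:
            self.created.append(path)
            yield {'path': path, 'result': True}


def ensure_dir(client, path):
    """Create path (and parents), consuming the generator; True if all succeeded."""
    outcomes = list(client.mkdir([path], create_parent=True))
    return all(item['result'] for item in outcomes)


fake = FakeClient()
fake.mkdir(['/tmp/demo'])              # generator is never consumed
print(fake.created)                    # []
print(ensure_dir(fake, '/tmp/demo'))   # True
print(fake.created)                    # ['/tmp/demo']
```

Passing a real `Client('localhost', 9000)` to `ensure_dir` should behave the same way, assuming the NameNode is reachable.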
Creating an empty file in HDFS
[4]:
# touchz() creates a zero-length file
for p in client.touchz(['/tmp/demo/text.txt']):
    print(p)
{'path': '/tmp/demo/text.txt', 'result': True}
Displaying the contents of a file in HDFS
[5]:
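# cat() yields one generator per file; the file created above is empty,
# so this loop prints nothing
for p in client.cat(['/tmp/demo/text.txt']):
    for line in p:
        print(line)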
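The inner loop above iterates over whatever pieces `cat` yields for a file; these are not guaranteed to be whole lines. A hedged sketch of collecting them into a single string follows. `read_text` and `FakeCatClient` are hypothetical names introduced for illustration, and the helper accepts either `bytes` or `str` pieces because the type may depend on the snakebite and Python versions in use.

```python
def read_text(client, path, encoding='utf-8'):
    """Concatenate everything cat() yields for one path and return it as text."""
    pieces = []
    for file_chunks in client.cat([path]):      # one generator per matched file
        for chunk in file_chunks:
            # Normalise to bytes so a multi-byte character split across
            # two chunks is still decoded correctly at the end
            if isinstance(chunk, str):
                chunk = chunk.encode(encoding)
            pieces.append(chunk)
    return b''.join(pieces).decode(encoding)


class FakeCatClient:
    """Stand-in for snakebite's Client: serves chunks from an in-memory dict."""

    def __init__(self, files):
        self.files = files

    def cat(self, paths):
        for path in paths:
            yield iter(self.files[path])


fake = FakeCatClient({
    '/tmp/demo/text.txt': [],                           # empty, like the touchz file
    '/tmp/demo/data.txt': [b'hello ', b'wor', b'ld\n'],
})
print(repr(read_text(fake, '/tmp/demo/text.txt')))      # ''
print(repr(read_text(fake, '/tmp/demo/data.txt')))      # 'hello world\n'
```

The path `/tmp/demo/data.txt` exists only in the fake and is not created anywhere in this notebook.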
Deleting
[6]:
# recurse=True removes the directory and everything under it
for p in client.delete(['/tmp/demo'], recurse=True):
    print(p)
{'path': '/tmp/demo', 'result': True}
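Running the delete cell a second time would target a path that no longer exists. One way to make cleanup repeatable is to check for the path first. The sketch below assumes the client's `test(path, exists=True)` call returns a boolean, which is my understanding of the snakebite API. `delete_if_exists` and `FakeFsClient` are hypothetical names, and the fake imitates only the two calls used here.

```python
def delete_if_exists(client, path):
    """Recursively delete path if present; return True only if something was removed."""
    if not client.test(path, exists=True):
        return False
    # Consume the generator so the delete is actually carried out
    outcomes = list(client.delete([path], recurse=True))
    return all(item['result'] for item in outcomes)


class FakeFsClient:
    """Stand-in for snakebite's Client backed by a set of existing paths."""

    def __init__(self, paths):
        self.paths = set(paths)

    def test(self, path, exists=False, directory=False, zero_length=False):
        return path in self.paths

    def delete(self, paths, recurse=False):
        for path in paths:
            # Remove the path itself and everything beneath it
            self.paths = {p for p in self.paths
                          if p != path and not p.startswith(path + '/')}
            yield {'path': path, 'result': True}


fake = FakeFsClient(['/tmp/demo', '/tmp/demo/text.txt'])
print(delete_if_exists(fake, '/tmp/demo'))   # True
print(delete_if_exists(fake, '/tmp/demo'))   # False
```

The second call returns `False` without attempting a delete, so the helper can be run repeatedly.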