databankr

Casio Databank/Telememo record format encoder/decoder
git clone git://git.luxferre.top/databankr.git

commit d23672269e3ebcbf1361d022eb631591927e84f9
Author: Luxferre <lux@ferre>
Date:   Tue, 21 May 2024 14:51:05 +0300

Initial upload

Diffstat:
A README       | 138 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A config.json  |  34 ++++++++++++++++++++++++++++++++++
A databankr.py | 187 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
3 files changed, 359 insertions(+), 0 deletions(-)
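
The README added by this commit states a per-record capacity formula and quotes 95 bits per record (2850 bits, 356 bytes) for the default "2515-lat" module. A minimal sketch to check those figures follows; the `capacity` helper is a hypothetical name that is not part of the commit, and the configuration values are copied from the config.json shown in the diff.

```python
import math

# Field sizes and charsets copied from the "2515-lat" entry of config.json.
config = {
    "namelen": 8,
    "numberlen": 15,
    "alpha": " ABCDEFGHIJKLMNOPQRSTUVWXYZ@!?'.:/+-0123456789",
    "digit": "-0123456789() ",
    "index": "ABCDEFGHIJKLMNOPQRSTUVWXY12345",
}

def capacity(cfg):
    """Return (bits per record, total bits, whole bytes) for a module config.

    Hypothetical helper: it mirrors the formula from the README, where the
    first name character is reserved for indexing and carries no payload.
    """
    name_bits = math.floor((cfg["namelen"] - 1) * math.log2(len(cfg["alpha"])))
    number_bits = math.floor(cfg["numberlen"] * math.log2(len(cfg["digit"])))
    record_bits = name_bits + number_bits
    total_bits = record_bits * len(cfg["index"])  # one record per index char
    return record_bits, total_bits, total_bits // 8

print(capacity(config))  # (95, 2850, 356)
```

The same function can be pointed at any other entry of config.json to estimate its capacity before encoding.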

diff --git a/README b/README
@@ -0,0 +1,138 @@
+Databankr: store arbitrary data inside Casio Databank/Telememo watches
+----------------------------------------------------------------------
+This is a Python utility program that allows you to encode arbitrary pieces
+of information into the format that could be entered into Casio watches with
+the databank/telememo function support, as well as retrieve and decode this
+information later. It supports both raw and hexadecimal data, as well as both
+file and standard input/output.
+
+== Usage ==
+
+The program can be run like this (as always, use -h flag to see live help):
+
+python3 databankr.py [enc/dec] [-t TYPE] [-i INPUT_FILE] [-o OUTPUT_FILE]
+                     [-c CONFIG] [-m MODULE] [-l EXPECTED_LENGTH]
+
+where the mode (enc/dec) parameter is mandatory, and the optional ones are:
+
+* -t: data type to be encoded or decoded (bin or hex, bin by default,
+      which means raw data)
+* -i: source input file path (default "-", which means stdin)
+* -o: result output file path (default "-", which means stdout)
+* -c: configuration file path (default "config.json" in the current dir)
+* -m: module configuration code according to your watch (default 2515-lat)
+* -l: expected decoded data length in bytes (default is 0 - no limits)
+
+In the encoding mode, the program outputs (to a file or the standard output)
+a set of double newline separated records that you need to enter into the
+databank/telememo function of your watch. Note that the number of characters
+is fixed for every model, so if the record name/number contains fewer of them,
+then you must pad the rest with spaces. The input to the encoding mode
+can also be a file or the standard input, and the data type flag (-t) defines
+whether this is a raw binary file or a hexadecimal string.
+
+In the decoding mode, the program expects a set of double newline separated
+databank/telememo records from a file or the standard input and outputs the
+reconstructed data into a file or the standard output according to the data
+type flag. By default, the output is bit-aligned with the number of records
+and the amount of bits held by each of them, so the excess is filled
+with null bytes. If you know the exact byte count of the source data, you can
+pass the -l flag to trim the restored information to the desired length.
+
+If you choose to enter the data manually via the standard input, press Ctrl+D
+when done. This works in both modes.
+
+Keep in mind that the records are case-sensitive: you must only enter the
+letters in exactly the same case as specified in the configuration
+section for the module of your choice. Most of the time, it will be upper case
+for the name parts of the records.
+
+== Configuration format ==
+
+The basic configuration file is shipped with Databankr and is suitable for
+several popular databank-enabled Casio models, but you can always extend it
+to support other ones if you know the structure of their records. The file is
+a normal JSON object where the keys are module identifiers (an identifier
+need not name the only module it works on), and the values are module config
+objects. Each such object contains the following fields:
+
+* description: a human-readable module description displayed by Databankr
+* namelen: the name field length in a databank record (in characters)
+* numberlen: the number field length in a databank record (in characters)
+* alpha: the entire character set that can be entered into a name field
+* digit: the entire character set that can be entered into a number field
+* index: a subset of the "alpha" charset sorted alphabetically that's used
+         for record indexing; its length must be equal to the total number
+         of records in the watch
+
+The namelen and numberlen fields are integers, all others are strings. The
+"index" field is necessary because all Databank/Telememo-enabled Casio watches
+utilize automatic sorting, so, to preserve the data order, the first character
+in the name part of the record is used to index the records and does not
+store the data payload itself.
+
+== FAQ ==
+
+- How and why was this invented?
+
+Databankr started in early 2023 as a JavaScript library with a different name,
+Telememer, that only catered to Casios with 2747 and 5574 modules (like AW-80,
+AMW-870 and so on). It was created as an attempt to turn the Telememo function
+of these watches into a kind of universal storage for arbitrary binary data,
+as such storage is pretty much unhackable and only accessible to those who
+physically use the watch. Besides, a phone or even a paper notebook is much
+more likely to be stolen, searched or confiscated than a cheap wristwatch from
+Casio. Then, in mid-2024, Databankr was created in its current form of a CLI
+application written in Python 3, supporting several different Casio modules
+out of the box.
+
+- How much data can we store this way?
+
+The overall formula of bits per record looks like this:
+
+bits = floor(number_len * log2(digits)) + floor((name_len - 1) * log2(chars)),
+
+where "digits" is how many different digits we can enter into the number part,
+"chars" is how many different characters we can enter into the name part, and
+"number_len" and "name_len" are the lengths of the number and name fields
+respectively. Then we can multiply this number by the number of records and it
+will be the total storage. For example, with the default "2515-lat" module
+configuration (which corresponds to a Casio DB-36/DB-360 watch set to English
+or Dutch language), we can store 95 bits per record which translates to 2850
+bits or 356 bytes of information in the entire databank.
+
+- What kind of information can I store in such limited space?
+
+If you happen to own an old and more advanced Casio Databank model (with 50,
+100 or even 150 records), you'll find even more possibilities (after creating
+your own configuration section for that model, of course). However, even 2850
+bits is still over 2048, which means you can store several cryptographic keys,
+important URLs and passwords (in an encrypted fashion) or other information
+that you don't need to glance at but need to be able to recover if you're only
+storing it on this particular Casio. Besides databank capacity, the only real
+tradeoff is your own readiness to manually enter the records into the watch
+and then retype them into the program (or a file) whenever you need to recover
+the information.
+
+- What happens if I enter more data than can be stored when encoding?
+
+It will be truncated prior to converting to records. Only one record set is
+supported at the moment.
+
+- Which modules are supported as of now?
+
+Currently, Databankr comes with the configurations for the following modules:
+
+* 2747: Casio modules 2747 and 5574
+* 2515-lat: Casio module 2515, basic Latin characters
+* 2515-cyr: Casio module 2515, Cyrillic characters
+* 2515-por: Casio module 2515, Portuguese characters
+
+Even though the program itself is considered complete, the configuration list
+is expected to grow in the future. Of course, everyone is encouraged to append
+their own configurations according to the "Configuration format" section.
+
+== Credits ==
+
+Created by Luxferre in 2024. Released into public domain with no warranties.
+
diff --git a/config.json b/config.json
@@ -0,0 +1,34 @@
+{
+  "2747": {
+    "description": "Casio 2747/5574 modules",
+    "namelen": 8,
+    "numberlen": 16,
+    "alpha": " ABCDEFGHIJKLMNOPQRSTUVWXYZ@!?',.;:()/+-0123456789",
+    "digit": " 0123456789()+-",
+    "index": "ABCDEFGHIJKLMNOPQRSTUVWXY12345"
+  },
+  "2515-lat": {
+    "description": "Casio 2515 module - basic Latin (English/Dutch)",
+    "namelen": 8,
+    "numberlen": 15,
+    "alpha": " ABCDEFGHIJKLMNOPQRSTUVWXYZ@!?'.:/+-0123456789",
+    "digit": "-0123456789() ",
+    "index": "ABCDEFGHIJKLMNOPQRSTUVWXY12345"
+  },
+  "2515-cyr": {
+    "description": "Casio 2515 module - Cyrillic",
+    "namelen": 8,
+    "numberlen": 15,
+    "alpha": " АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ@!?'.:/+-0123456789",
+    "digit": "-0123456789() ",
+    "index": "АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЭЯ12345"
+  },
+  "2515-por": {
+    "description": "Casio 2515 module - Portuguese",
+    "namelen": 8,
+    "numberlen": 15,
+    "alpha": " AÁÀÂÃBCÇDEÉÊFGHIÍJKLMNOÓÔÕPQRSTUÚVWXYZ@!?'.:/+-0123456789",
+    "digit": "-0123456789() ",
+    "index": "ABCDEFGHIJKLMNOPQRSTUVWXY12345"
+  }
+}
diff --git a/databankr.py b/databankr.py
@@ -0,0 +1,187 @@
+#!/usr/bin/env python3
+# Databankr: a CLI tool to encode arbitrary data to Casio Databank/Telememo
+# watches and restore it from there
+# Created by Luxferre in 2024, released into public domain
+
+import sys, math, json, re
+
+# universal base conversion methods
+
+def to_base(number, base, charset):
+    if not number:
+        return charset[0]
+    res = ''
+    while number > 0:
+        res = charset[number % base] + res
+        number //= base
+    return res
+
+def from_base(number, base, charset):
+    if number == charset[0]:
+        return 0
+    res = 0
+    for c in number:
+        ind = charset.find(c)
+        if ind > -1:
+            res = res * base + ind
+        else:
+            return 0  # unknown character: the whole field decodes as 0
+    return res
+
+# record preparation methods
+
+# create a padded field from a numeric value
+def val_to_field(n, charset, padlen):
+    return to_base(n, len(charset), charset).rjust(padlen, charset[0])
+
+# main encoding method
+def encode(data: bytes, config):
+    effective_namelen = config['namelen'] - 1
+    alphabase = len(config['alpha'])
+    numbase = len(config['digit'])
+    indexsize = len(config['index'])
+    # calculate record estimation
+    name_part_bits = int(effective_namelen * math.log2(alphabase))
+    number_part_bits = int(config['numberlen'] * math.log2(numbase))
+    record_bits = name_part_bits + number_part_bits # single record capacity
+    max_bits = record_bits * indexsize # overall databank capacity
+    # start processing
+    bitstr = ''.join(f'{c:08b}' for c in data) # create an aligned bitstring
+    bitlen = len(bitstr) # overall bitstring length
+    if bitlen > max_bits: # truncate the excess
+        bitlen = max_bits
+        bitstr = bitstr[:max_bits]
+    rec_len = int(math.ceil(bitlen / record_bits)) # message length in records
+    records = [] # list of lists
+    pos = 0 # current position tracker
+    for i in range(0, rec_len): # slice over the bitstring
+        namebin = bitstr[pos:pos+name_part_bits].ljust(name_part_bits, '0')
+        pos += name_part_bits
+        numbin = bitstr[pos:pos+number_part_bits].ljust(number_part_bits, '0')
+        pos += number_part_bits
+        # now we only got binary representation of both parts
+        # let's convert them to bigints and then to the actual records
+        namefield = config['index'][i] + val_to_field(int(namebin, 2),
+            config['alpha'], effective_namelen)
+        numfield = val_to_field(int(numbin, 2),
+            config['digit'], config['numberlen'])
+        records.append([namefield, numfield])
+    return records
+
+# main decoding method
+def decode(records, config, expected=0):
+    alphabase = len(config['alpha'])
+    numbase = len(config['digit'])
+    effective_namelen = config['namelen'] - 1
+    name_part_bits = int(effective_namelen * math.log2(alphabase))
+    number_part_bits = int(config['numberlen'] * math.log2(numbase))
+    bitstr = '' # bit string storage
+    for rec in records: # iterate over records
+        nameval = from_base(rec[0][1:], alphabase, config['alpha'])
+        numval = from_base(rec[1], numbase, config['digit'])
+        bitstr += format(nameval, '0b').zfill(name_part_bits)
+        bitstr += format(numval, '0b').zfill(number_part_bits)
+    # reconstruct the raw data from the bitstring and return it
+    datalen = int(math.ceil(len(bitstr)/8)) # estimate the data length in bytes
+    if expected > 0: # truncate if expected length is specified
+        datalen = expected
+    data = b'' # raw data placeholder
+    for i in range(0, datalen): # iterate over byte slices
+        ind = i << 3
+        data += int(bitstr[ind:ind+8].ljust(8, '0'), 2).to_bytes(1, 'big')
+    return data
+
+def auto_int(x): # helps to convert from any base natively supported in Python
+    return int(x,0)
+
+if __name__ == '__main__': # main app start
+    from argparse import ArgumentParser
+    parser = ArgumentParser(description='Databankr: Casio Databank/Telememo record format encoder/decoder', epilog='(c) Luxferre 2024 --- No rights reserved <https://unlicense.org>')
+    parser.add_argument('mode', help='Operation mode (enc/dec)')
+    parser.add_argument('-t', '--type', type=str, default='bin', help='Data type (bin/hex, default bin)')
+    parser.add_argument('-i', '--input-file', type=str, default='-', help='Source input file (default "-", stdin)')
+    parser.add_argument('-o', '--output-file', type=str, default='-', help='Result output file (default "-", stdout)')
+    parser.add_argument('-c', '--config', type=str, default='config.json', help='Configuration JSON file path (default config.json in current working directory)')
+    parser.add_argument('-m', '--module', type=str, default='2515-lat', help='Module configuration code according to your watch (default 2515-lat)')
+    parser.add_argument('-l', '--expected-length', type=auto_int, default=0, help='Expected decoded data length in bytes (default 0 - no limits)')
+    args = parser.parse_args()
+
+    # detect the mode
+    if args.mode == 'enc':
+        flow = 'enc'
+    elif args.mode == 'dec':
+        flow = 'dec'
+    else:
+        print('Invalid mode! Please specify enc or dec!')
+        exit(1)
+
+    # load the configuration file
+    try:
+        f = open(args.config)
+        confdata = json.load(f)
+        f.close()
+    except Exception:
+        print('Config file missing or invalid!')
+        exit(1)
+
+    # load the module config
+    if args.module in confdata:
+        moduleconfig = confdata[args.module]
+        print('Loaded the configuration for %s' % moduleconfig['description'])
+    else:
+        print('Module configuration %s not found in the config file!' % args.module)
+        exit(1)
+
+    # load the input data
+    try:
+        if args.input_file == '-':
+            infd = sys.stdin.buffer # binary stream, so indata is always bytes
+        else:
+            infd = open(args.input_file, mode='rb')
+        indata = infd.read()
+        if infd is not sys.stdin.buffer:
+            infd.close()
+    except Exception:
+        print('Error reading the input data!')
+        print(sys.exc_info())
+        exit(1)
+
+    # run the selected flow
+    if flow == 'enc': # encoding flow
+        if args.type == 'hex': # convert the input data if the type is hex
+            indata = bytes.fromhex(re.sub(r"[^0-9a-fA-F]", "" ,indata.decode('utf-8')))
+        records = encode(indata, moduleconfig)
+        outdata = ''
+        for rec in records: # separate each record with double newline
+            outdata += rec[0] + '\n' + rec[1] + '\n\n'
+    else: # decoding flow
+        # parse the records
+        rawrecs = indata.decode('utf-8').split('\n\n')
+        records = [] # records will be stored here
+        for pairstr in rawrecs:
+            if len(pairstr) > 0: # exclude empty records
+                pair = pairstr.split('\n') # get raw pair and then left-adjust
+                records.append([pair[0].ljust(moduleconfig['namelen'], ' '),
+                    pair[1].ljust(moduleconfig['numberlen'], ' ')])
+        # decode the records
+        outdata = decode(records, moduleconfig, args.expected_length)
+        if args.type == 'hex': # convert the output data if the type is hex
+            outdata = outdata.hex()
+    # now, write the output file
+    try:
+        if args.output_file == '-':
+            outfd = sys.stdout.buffer # binary stream, accepts decoded bytes
+            sys.stdout.flush() # emit earlier text output before raw bytes
+        else:
+            outfd = open(args.output_file, mode='wb')
+        if isinstance(outdata, str): # records and hex dumps are text
+            outdata = outdata.encode('utf-8')
+        outfd.write(outdata)
+        if outfd is not sys.stdout.buffer:
+            outfd.close()
+    except Exception:
+        print('Error writing the output file!')
+        print(sys.exc_info())
+        exit(1)
+
+    print('\nOperation complete')
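
The base conversion helpers in databankr.py carry the whole encoding, so a quick round-trip check of a single field may be useful when reviewing the commit. The sketch below restates them in a self-contained form under one stated assumption: `from_base` here raises `ValueError` on a character outside the charset, which is a deliberate deviation from the committed code (that version returns 0 instead). The charset is the "alpha" string of the "2515-lat" module.

```python
# Charset copied from the "alpha" field of the "2515-lat" module config.
ALPHA = " ABCDEFGHIJKLMNOPQRSTUVWXYZ@!?'.:/+-0123456789"

def to_base(number, charset):
    """Render a non-negative integer using charset as the digit alphabet."""
    base = len(charset)
    if number == 0:
        return charset[0]
    digits = []
    while number > 0:
        number, rem = divmod(number, base)
        digits.append(charset[rem])
    return ''.join(reversed(digits))

def from_base(text, charset):
    """Inverse of to_base.

    Assumption for this sketch only: unknown characters raise ValueError,
    whereas the committed from_base silently returns 0.
    """
    base = len(charset)
    value = 0
    for ch in text:
        ind = charset.find(ch)
        if ind < 0:
            raise ValueError(f'character {ch!r} is not in the charset')
        value = value * base + ind
    return value

def val_to_field(n, charset, padlen):
    """Left-pad with the zero character so the field has a fixed width."""
    return to_base(n, charset).rjust(padlen, charset[0])

value = 45                              # an arbitrary small payload
field = val_to_field(value, ALPHA, 7)   # 7 = namelen - 1 for 2515-lat
print(repr(field))                      # '      9'
assert from_base(field, ALPHA) == value
```

Because the zero character of this charset is a space, small values produce fields that are mostly blank, which is why the README asks for records to be entered at full width.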