Crazy Python Action

Most of the coders I know got interested in programming because they love to tinker with things. As we progress in our careers, we often end up writing the same code over and over. The aim of this blog is to explore how to retain that sense of tinkering while still accomplishing our day-to-day tasks. I will focus on teaching you to write code that future-you won't hate!

Your code is too DRY!

Published 10/23/2024

If you have ever let anyone else review your code, you have probably heard some variation of "This is repetitive" or "This code is not very DRY." We quickly learn that when a function or variable can be extracted we should do it! If we have copy-pasted the same code 20 times, chances are someone will modify one copy and forget to modify the other 19, making it harder to find the duplicated code in the future.

Even our IDEs highlight repetition now!

Set attributes with JSONPath

Published 10/15/2024

JSONPath is an excellent way to simplify access of values deep within complicated JSON structures. It is widely supported in web programming languages as well as many databases.

I was recently given a task to create an API and data structure for issuing arbitrary corrections to a JSON document, while keeping the original JSON unmodified. The size of each JSON document made it very inefficient to store a new copy of the document each time it was modified, so I needed to use a delta (change) format. I immediately thought of using JSONPath (path expressions for JSON documents), as we often use it to query our JSON. It is widely implemented and easily understood. JSONPath is not typically used to modify data in a JSON document, but I decided to give it a go.

Ultimately, I wanted to be able to store a list of corrections that could be applied to a target one at a time, in order. Each correction would be a combination of a JSONPath representing the target location and a value to write at that location.

With both the corrections and the original document stored, retrieving the final version would be simple:

document = json.loads(target_json)

for correction in corrections:
    apply_correction(document, correction)

JSONPath expressions look like this:

'$': the dollar sign represents the entire JSON document, or the root node of the document
'$.customer': using dot notation, you can access individual values by their key
"$['customer']": access the same key using bracket notation (dot and bracket notation can even be mixed)
'$.line_items[0]': using brackets, you can access individual list items by their 0-based index
'$.line_items[0].product_id': there is no limit on nesting, allowing deep queries very easily
'$.line_items[*]': this will select all children of line_items, and is not exactly useful for our purpose of small corrections
'$.line_items[?@.qty > 5]': JSONPath also supports logical (filter) queries. This one would select all line items with a quantity greater than 5. This type of filtering is also not useful for issuing corrections

The path expressions that can result in multiple results did not make sense in the scope of issuing document corrections, but the rest looked simple enough. In fact they looked exactly like Python's list-like and dict-like accessors.
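
To make that resemblance concrete, here is a small sketch pairing the single-target path forms above with the equivalent Python accessors on a decoded document. The order document is a made-up stand-in for illustration, not the sample data used below.

```python
import json

# Hypothetical stand-in document, used only for this illustration.
order = json.loads(
    '{"customer": "ACME", "line_items": [{"product_id": "A1", "qty": 3}]}'
)

# JSONPath                          Python accessor on the decoded document
# $                              -> order
# $['customer']                  -> order['customer']
# $['line_items'][0]             -> order['line_items'][0]
# $['line_items'][0]['product_id'] -> order['line_items'][0]['product_id']
customer = order['customer']
first_item = order['line_items'][0]
product_id = order['line_items'][0]['product_id']

print(customer, product_id)
```

In bracket notation, everything after the leading $ is already a valid Python subscript chain on the dict and list objects that json.loads() returns.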

Sample Data

Let's set up some sample data, representing a purchase order and two corrections:

target_json = '''{
    "customer": "Wright, Callahan and Hale",
    "order_timestamp": "2024-10-15T14:21:28.200830Z",
    "line_items": [
        {
            "product_id": "0493774426549",
            "qty": 3,
            "unit_price": "586.12"
        }
    ],
    "shipping_destination": {
        "address_1": "804 Martinez Walk Apt. 638",
        "city": "Thomasland",
        "state": "IN",
        "country": "USA",
        "postal_code": "43216"
    }
}'''

correction_1 = {
    'json_path': "$['line_items'][0]['qty']",
    'value': 6,
    'user': 'alyssagreen@example.net',
    'timestamp': '2024-10-15T16:27:25.255840Z',
}

correction_2 = {
    'json_path': '$.shipping_destination.address_1',
    'value': '904 Martinez Walk Apt. 638',
    'user': 'alyssagreen@example.net',
    'timestamp': '2024-10-15T16:28:24.356125Z',
}

Examining the corrections listed above, notice that correction_1 uses a JSONPath with bracket notation, and correction_2 uses a JSONPath with dot notation. We need to ensure that corrections with each type of path expression can be applied.

Initial Implementation

With those examples in place, I took a first stab at implementing apply_correction():

import json

def apply_correction(target, correction):
    """
    Generates and executes a Python assignment statement like:
        `target.line_items[0].qty = value`
    Modifies <target> in-place
    """
    json_path: str = correction['json_path']
    value = correction['value']
    python_path = json_path.replace('$', 'target', 1)
    exec(f'{python_path} = value', {}, locals())


document = json.loads(target_json)
apply_correction(document, correction_1)
assert document['line_items'][0]['qty'] == 6

Well, I did mention that JSONPath closely mimics Python's own syntax! With a simple string replacement, we were able to turn our JSONPath into an exec()utable Python assignment. This method immediately brought us halfway to our goal. By default, json.loads() gives us dict and list objects, which are mutable, passed by reference, and support bracket notation just like JSONPath!

Security note: this implementation uses exec(), which should generally be avoided. exec() and eval() can open up large security holes and introduce performance issues. They require safe, known inputs. I believe exec() makes this particular code very succinct, but it requires care to maintain safety.
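
One way to get "safe, known inputs" is to whitelist the path syntax before it ever reaches exec(). The following is only a sketch under assumptions: is_safe_json_path is a hypothetical helper, and it assumes keys are plain names that start with a letter. It accepts only the simple dot, bracket, and index forms, and rejects everything else, including wildcards, filters, and underscore-prefixed names such as __class__.

```python
import re

# One path segment: .name, ['name'], or [0].
# Names must start with a letter, so dunder attributes are rejected.
_SEGMENT = r"(?:\.[A-Za-z][A-Za-z0-9_]*|\['[A-Za-z][A-Za-z0-9_ ]*'\]|\[\d+\])"
_SAFE_PATH = re.compile(r"\$" + _SEGMENT + r"+")


def is_safe_json_path(json_path: str) -> bool:
    """Return True only for simple, single-target paths (hypothetical helper)."""
    return _SAFE_PATH.fullmatch(json_path) is not None


print(is_safe_json_path("$['line_items'][0]['qty']"))         # True
print(is_safe_json_path('$.shipping_destination.address_1'))  # True
print(is_safe_json_path('$.line_items[*]'))                   # False
print(is_safe_json_path('$.__class__'))                       # False
```

This is not a full JSONPath grammar. It is deliberately much narrower, so a path that fails the check should be refused rather than repaired.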

Handling dot notation

Unfortunately, attempting to apply our second correction, which uses dot notation, using our initial implementation fails:

>>> apply_correction(document, correction_2)
Traceback (most recent call last):
  File "/opt/.pycharm_helpers/pydev/pydevconsole.py", line 364, in runcode
    coro = func()
           ^^^^^^
  File "<input>", line 1, in <module>
  File "<input>", line 5, in apply_correction
  File "<string>", line 1, in <module>
AttributeError: 'dict' object has no attribute 'shipping_destination'

The executed Python statement target.shipping_destination.address_1 = '904 Martinez Walk Apt. 638' fails because target is a dict and has no attributes. What we want is for the attribute access .shipping_destination to access a key in that dictionary, but Python does not do that automatically.

We need a class that will give us both key- and attribute-like access. We could start with a dict and add attribute access, or start with one of the many attribute-based data classes available in Python (dataclasses, SimpleNamespace, namedtuple...) and add key-like access. SimpleNamespace is an often-overlooked utility baked right into the Python standard library and turns out to fit this usage quite well. SimpleNamespace takes keyword arguments (think dict) to its constructor and makes them accessible as attributes. We can add __getitem__ and __setitem__ implementations to make it also act like a dict.

from types import SimpleNamespace

class MutableJsonPath(SimpleNamespace):
    def __getitem__(self, item):
        return getattr(self, item)

    def __setitem__(self, key, value):
        setattr(self, key, value)

Usage

>>> target = MutableJsonPath(foo='bar', inner=MutableJsonPath(hello='world'))
>>> target.foo
'bar'
>>> target['foo']
'bar'
>>> target.inner.hello
'world'
>>> target['inner']['hello']
'world'
>>> target['inner']['hello'] = '🌍'
>>> target.inner.hello
'🌍'

Final Implementation

Now we need to ensure that when we load from JSON, we use our new MutableJsonPath class instead of the default dict type. To customize JSON loading, we use the object_hook parameter to json.loads. Unfortunately, object_hook passes each decoded object as a single dict, while the SimpleNamespace constructor expects keyword arguments (a positional mapping is accepted only from Python 3.13 on), so we need a small addition to our class:

class MutableJsonPath(SimpleNamespace):
    @classmethod
    def from_dict(cls, d: dict):
        """Constructor, compatible with the object_hook param to json.loads"""
        return cls(**d)

    def __getitem__(self, item):
        return getattr(self, item)

    def __setitem__(self, key, value):
        setattr(self, key, value)


document = json.loads(target_json, object_hook=MutableJsonPath.from_dict)

The final line shows how to apply the new class when loading a JSON document from a string. The resulting document will be an instance of MutableJsonPath, as will all nested objects within the original JSON. Arrays from the JSON will still be Python lists, and values will still have their normal Python types.
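
A quick check of those claims, as a sketch: the class is repeated so the snippet runs on its own, and the small doc string is a hypothetical document used only to inspect the decoded types.

```python
import json
from types import SimpleNamespace


class MutableJsonPath(SimpleNamespace):
    @classmethod
    def from_dict(cls, d: dict):
        return cls(**d)

    def __getitem__(self, item):
        return getattr(self, item)

    def __setitem__(self, key, value):
        setattr(self, key, value)


# Small hypothetical document, just to inspect the decoded types.
doc = json.loads(
    '{"name": "demo", "tags": ["a", "b"], "owner": {"id": 7}}',
    object_hook=MutableJsonPath.from_dict,
)

print(type(doc).__name__)        # MutableJsonPath
print(type(doc.owner).__name__)  # MutableJsonPath (nested object)
print(type(doc.tags).__name__)   # list (arrays are unchanged)
print(doc['owner']['id'], doc.owner.id)  # 7 7
```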

All together now

With our new class in place, we can now load the document and apply our two sample corrections:

import json
from types import SimpleNamespace

target_json = '''{
    "customer": "Wright, Callahan and Hale",
    "order_timestamp": "2024-10-15T14:21:28.200830Z",
    "line_items": [
        {
            "product_id": "0493774426549",
            "qty": 3,
            "unit_price": "586.12"
        }
    ],
    "shipping_destination": {
        "address_1": "804 Martinez Walk Apt. 638",
        "city": "Thomasland",
        "state": "IN",
        "country": "USA",
        "postal_code": "43216"
    }
}'''

correction_1 = {
    'json_path': "$['line_items'][0]['qty']",
    'value': 6,
    'user': 'alyssagreen@example.net',
    'timestamp': '2024-10-15T16:27:25.255840Z',
}

correction_2 = {
    'json_path': '$.shipping_destination.address_1',
    'value': '904 Martinez Walk Apt. 638',
    'user': 'alyssagreen@example.net',
    'timestamp': '2024-10-15T16:28:24.356125Z',
}


class MutableJsonPath(SimpleNamespace):
    @classmethod
    def from_dict(cls, d: dict):
        """Constructor, compatible with the object_hook param to json.loads"""
        return cls(**d)

    def __getitem__(self, item):
        return getattr(self, item)

    def __setitem__(self, key, value):
        setattr(self, key, value)


def apply_correction(target, correction):
    """
    Generates and executes a Python assignment statement like:
        `target.line_items[0].qty = value`
    Modifies <target> in-place
    """
    json_path: str = correction['json_path']
    value = correction['value']
    python_path = json_path.replace('$', 'target', 1)
    exec(f'{python_path} = value', {}, locals())


document = json.loads(target_json, object_hook=MutableJsonPath.from_dict)
apply_correction(document, correction_1)
assert document['line_items'][0]['qty'] == 6

apply_correction(document, correction_2)
assert document['shipping_destination']['address_1'] == '904 Martinez Walk Apt. 638'

print(document)

Caveats and Improvements

Of course this implementation is not ready for production, as it was made as a quick and dirty prototype. It is a special-purpose, limited implementation of JSONPath. It is ready for many improvements.

Input to this function should be validated to ensure security.
Converting JSONPath expressions to executable Python helped us reach our goal quickly and generically in this case, but an alternative data structure for expressing the target path might be better for many domains.
More robust code might include making a deep copy of the decoded JSON document prior to applying corrections.
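
As one possible direction for the last two points, here is a sketch that avoids exec() entirely. It assumes the target is stored as a list of dict keys and list indexes instead of a JSONPath string, and it returns a corrected deep copy rather than modifying its input. The name apply_correction_steps and the steps format are hypothetical, not part of the prototype above.

```python
import copy
import json


def apply_correction_steps(document, steps, value):
    """Return a corrected deep copy of document (hypothetical exec-free variant).

    steps is a list of dict keys (str) and list indexes (int),
    e.g. ['line_items', 0, 'qty'].
    """
    if not steps:
        raise ValueError('steps must name a location inside the document')
    corrected = copy.deepcopy(document)  # leave the caller's document untouched
    node = corrected
    for step in steps[:-1]:  # walk down to the parent of the target
        node = node[step]
    node[steps[-1]] = value  # write the new value at the target location
    return corrected


original = json.loads('{"line_items": [{"qty": 3}]}')
corrected = apply_correction_steps(original, ['line_items', 0, 'qty'], 6)
print(original['line_items'][0]['qty'], corrected['line_items'][0]['qty'])  # 3 6
```

Because the steps are data rather than code, there is nothing to execute and no path syntax to validate. The trade-off is that the stored corrections are no longer plain JSONPath strings.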

Practical guide to figurative constants in RPG

Published 10/11/2024

What exactly are *BLANKS, *ON, *OFF, etc.? I see these used in RPG all the time and can usually tell what they mean, but I needed to know more details, so here they are.

First, it is helpful to remember that string comparisons in RPG have a quirk compared to most languages on other platforms. When comparing strings of different lengths, the shorter string is right-padded with blanks (spaces) up to the length of the longer. Thus the two strings being compared are always equal-length. This means the two variable-length strings varcharA = ''; and varcharB = ' '; will compare as equal. This quirk likely dates from the language's origins, when only fixed-length (unterminated) strings were available.

Since this is a Python blog after all, let's see what this comparison would look like in Python:

import functools

@functools.total_ordering  # fill in unimplemented comparison functions
class RpgStr:
    def __init__(self, val: str = ''):
        """Wrapper around a normal Python str, implementing RPG-like comparison"""
        self.val = val or ''

    def __len__(self):
        return len(self.val)

    def __getattr__(self, item):
        """Pass-through to support other normal str functionality"""
        return getattr(self.val, item)

    def __eq__(self, other):
        return self.ljust(len(other)) == other.ljust(len(self))

    def __lt__(self, other):
        return self.ljust(len(other)) < (other.ljust(len(self)))


a = RpgStr('abc')
b = RpgStr('abc' + '     ')

print(a == b)  # True

Knowing the above regarding string comparisons makes it easy to understand how these "figurative constants" work in RPG. Each represents a number or character set that will be repeated to build a string for use in comparisons and assignments. The length of the constant will be equal to the length of the other "side" of the comparison or assignment.
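
In Python terms, the "repeat to the length of the other side" rule could be sketched like this. The function figurative is a hypothetical helper written for this illustration; it is not part of RPG or of any library.

```python
def figurative(pattern: str, length: int) -> str:
    """Repeat pattern cyclically and cut it to exactly length characters."""
    if not pattern:
        raise ValueError('pattern must not be empty')
    repeats = -(-length // len(pattern))  # ceiling division
    return (pattern * repeats)[:length]


print(figurative('xyzzy', 24))          # xyzzyxyzzyxyzzyxyzzyxyzz
print(figurative(' ', 10) == ' ' * 10)  # True
print(figurative('0', 5))               # 00000
```

The length argument stands in for the size of the field on the other side of the comparison or assignment, which RPG works out for you.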

Figurative Constants

Here are the available figurative constants:

*ALL'x...': repeats the characters in the single quotes up to the specified length. Variations *ALLG, *ALLU, and *ALLX exist for repeated graphic, Unicode, and hexadecimal literals.
*BLANK / *BLANKS: repeats a blank (' ') character up to the specified length, usually equivalent to *ALL' '. Valid only for string types.
*ZERO / *ZEROS: repeats a zero ('0') character up to the specified length, usually equivalent to *ALL'0'.
*HIVAL, *LOVAL: represent the maximum or minimum value representable by a numeric or date/time type, for example 99 and -99 for type packed(2,0). Interestingly, the lo/hi values for dates and times may vary based on the DATFMT or TIMFMT setting.
*NULL: represents a null value for pointers (not the same as NULL in SQL).
*ON and *OFF: map to '1' and '0' respectively and are valid for character types and indicators (indicators are essentially char(1) values which must be '1' or '0').

Note: there is no difference between the singular and plural forms (e.g. *BLANK and *BLANKS); they are synonyms for convenience

Examples for fun

The following code is ready to compile and call:

**free

dcl-s vStr varchar(10) inz('');
dcl-s vInitBlankStr varchar(10) inz(*blanks);  // will have %len() = 10
dcl-s vLongStr varchar(24) inz('');
dcl-s vTenDigit packed(10) inz(0);
dcl-s vStrArray varchar(10) dim(10);

dcl-s vPrompt varchar(25) inz('Press Enter');
dcl-s vResult varchar(1) inz('');                 

// compare empty string to *blanks, *blanks will be ''
if vStr = *blanks;
  dsply 'vStr = *blanks evals TRUE, with length 0';
endif;

// compare long blank string to *blanks, *blanks will be '          '
if vInitBlankStr = *blanks;
  dsply 'vInitBlankStr = *blanks evals TRUE, with length 10';
endif;

// compare different-length strings. shortest will be padded
if vStr = vInitBlankStr;
  dsply 'vStr = vInitBlankStr evals TRUE, with diff lengths';
endif;

// assign blanks to a variable string type, will be padded to the max size
vStr = *blanks;
dsply ('length of vStr: ' + %char(%len(vStr)));  // this will display 10

// assign a repeating sequence to a str
vLongStr = *ALL'xyzzy';
dsply ('vLongStr = ' + vLongStr);  // 'xyzzyxyzzyxyzzyxyzzyxyzz'

// assign repeating digits to a number
vTenDigit = *ALL'72';
dsply ('vTenDigit = ' + %char(vTenDigit));  // '7272727272'

// can I turn a string to all ones?
vStr = *ON;
dsply vStr;  // yep; '1111111111'

// fill an entire array with a figurative constant?
vStrArray = *ALL'123';
dsply ('vStrArray(2) = ' + vStrArray(2));  // each array value is now '1231231231'

*inlr = *on;
dsply vPrompt '' vResult;   // pause at the end of execution

Running the example code

RPG code runs on IBM i systems. Paste the above code in a source member called CHKCONST in <yourlib>/QRPGLESRC and use the following to compile/run:

Compile: CRTBNDRPG PGM(<yourlib>/CHKCONST) REPLACE(*YES)

Run: CALL <yourlib>/CHKCONST

Recommendations

After looking at many examples of these constants in use, and playing with them, I have these recommendations:

Take advantage of checking whether a string is "empty" with vStr = *blanks, whether the string length is variable or fixed! In other languages, we often must write some variation of %rtrim(vStr) = '' and the "RPG way" is cleaner.
Remember that assigning *blanks to a variable length string will waste memory. Use vStr = '' for a zero-length string. You can use this for fixed-length strings too, and it will still blank the entire string.
If you need to compare BOTH length and content of two strings, you must explicitly do so: if %len(string1) = %len(string2) and string1 = string2. Putting the length comparison first is likely best for both readability and short-circuit efficiency.
In free-form (modern) RPG, I see no reason to use *ZERO for numeric types, as opposed to just a literal 0.
When setting indicator variables, always use *ON or *OFF as the value (never the literals 1 or 0). This will help future-proof your code.
If you are using comparisons to *ON or *OFF you can stop: if *in45 = *ON; is the same as if *in45;. In fact, the expression *in45 = *ON itself evaluates to one of the literals *ON or *OFF. You can use this in assignments: after the statement isAwesome = ('RPG' = 'Awesome') the indicator/char isAwesome will contain *OFF.
