Sunday, 30 November 2025

exec, eval, return and ruby

In this post about expressions I mentioned that Kotlin features return expressions, which is a rather surprising feature. Let's see it in action:


// Kotlin code:
fun getCapital(country: Country?): String {
     val city = country?.capital ?: return "Paris"
     // this won't run if we could not find a capital
     logger.log("We've found the capital")
     return city
}

Unlike try or throw expressions, which can be simulated with a function (in JavaScript, Python...) [1], [2], return expressions cannot be mimicked with a "return() function": it would exit from that function itself, not from the calling one. It came to my mind that maybe we could use a trick in JavaScript with eval() (I already knew that it would not work in Python with exec()), but no, it does not work in JavaScript either.


// JavaScript code:
function demo() {
    eval("return 42;");
    console.log("This will never run");
}

console.log(demo());
// Output: SyntaxError: Illegal return statement


JavaScript gives us a SyntaxError because that return cannot work the way we intend (returning from the enclosing function), so the language prevents us from even trying. The code passed to eval is parsed and compiled as a separate script, not as part of the body of the enclosing function, and a return statement is only valid inside a function body. It's not as if the code were magically placed inline in the enclosing function, so return (or a break or continue targeting a loop in the caller) has nothing to act on, and JavaScript rejects it at parse time.

The reason why I thought this might be possible is that, as I explained in this previous post, JavaScript eval() is more powerful than Python exec(): it allows us to modify and even add variables in the enclosing function. As a reminder:


// JavaScript code:
function declareNewVariable() {
    // it has to be declared with "var": a "let" would stay local to the eval code itself
    let block = "var a = 'Bonjour';";
    eval(block);
    console.log(`a: ${a}`);
}

declareNewVariable();
// a: Bonjour


This works because when JavaScript compiles and executes a "block" of code with eval() it gives it access to the scope chain of the enclosing function.

Python could have also implemented this feature, but it would be very problematic in performance terms. Each Python function stores its local variables in an array (I think it's the f_localsplus attribute of the internal frame/interpreter object, not to be confused with the higher-level PyFrameObject wrapper), and the bytecode accesses those variables by index in that array (using the LOAD_FAST and STORE_FAST instructions), not by name. exec(), on the other hand, accepts an arbitrary dictionary to be used as locals, so the code it runs accesses variables through a dictionary lookup (with LOAD_NAME and STORE_NAME), either in that custom dictionary or in the one created from the real locals. Basically there's no easy way to reconcile both approaches. Well, indeed exec() could have been designed to receive by default a write-through proxy like the one created by frame.f_locals. That would allow modifying variables of the enclosing function, but would not work for adding variables to it (see this post). So I guess the Python designers have seen it as more coherent to prevent both cases rather than having one case work (modifying a variable) and the other not (adding a new variable). As for the PyFrameObject stuff that I mention, some GPT information:

In Python 3.11+, the local variables and execution state are stored in interpreter frames (also called "internal frames"), which are lower-level C structures that are much more lightweight than the old PyFrameObject.
When you call sys._getframe() or use debugging tools, CPython creates a PyFrameObject on-demand that acts as a Python-accessible wrapper around the internal frame data. This wrapper is what you can inspect from Python code, but it's only created when needed.

So all in all we can say (well, a GPT says...)

Bottom line: Neither Python’s exec() nor JavaScript’s eval() can magically splice control-flow into the caller’s code. They both create separate compilation units. JavaScript feels “closer” because eval() shares lexical scope, but the AST boundaries still apply.
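For the Python side, a minimal sketch of both limitations (the function names are made up for this illustration): exec() rejects a bare return, and an assignment done inside exec() does not reach the real local variable of the enclosing function.

```python
def try_return_via_exec():
    """exec() compiles its argument as a separate unit, so a bare 'return' is rejected."""
    try:
        exec("return 5")
    except SyntaxError as ex:
        return f"SyntaxError: {ex.msg}"
    return "exec accepted the return"


def try_modify_local_via_exec():
    """The assignment inside exec() goes to a dictionary, not to the real local x."""
    x = 1
    exec("x = 2")
    return x


print(try_return_via_exec())
print(try_modify_local_via_exec())  # 1
```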

After all this, one interesting question comes up: is there any language where the equivalent of eval/exec allows us to return from the enclosing function? The answer is yes: Ruby (and obviously it also allows modifying and adding variables in the enclosing function). Additionally, notice that Ruby also supports return expressions (well, everything in Ruby is an expression).


# ruby code:
def example
  result = eval("return 5")
  puts "This won't execute"
end

example  # returns 5

Ruby's eval is much more powerful than JavaScript's or Python's - it truly executes code as if it were written inline in the enclosing context.

The "as if" is important. It's not that Ruby compiles the code passed to eval and somehow embeds it in the middle of the currently running function. I guess that could be possible in a tree-walking (AST) interpreter, by modifying the AST of the current function, but Ruby moved to bytecode and JIT compilation long ago. What really happens is this:

Ruby's eval compiles the string to bytecode and then executes it in the context of the provided binding, which includes:

- Local variables
- self (the current object)
- The control flow context (the call stack frame)

That last part is key. When you pass a Binding object, you're not just passing variables - you're passing a reference to the actual execution frame. So when the evaled code does return, break, or next (Ruby's continue), it operates on that captured frame. Here's where it gets wild.

The Binding object idea (an object that represents the execution context of a function) is amazing. By default (when you don't explicitly provide a binding object) the binding object represents the current execution frame, but you can even pass as binding the execution frame of another function!!! You can get access to variables from another function, and if that function is still active (it's up in the call stack) you can even return from that function, meaning you can make control flow jump from one function to another one up in the stack chain!

eval operates on a Binding object (which you can pass explicitly), and that binding captures the complete execution context: local variables, self, the surrounding scope, everything. You can even capture bindings and pass them around.

Just notice that Python allows a small subset of the binding object functionality by allowing us to explicitly provide custom dictionaries as locals and globals to exec().
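A small sketch of that subset, assuming nothing beyond the standard exec(source, globals, locals) signature (run_with_locals is a hypothetical helper name): the code runs against the dictionary we hand over, and we read the results back from it.

```python
def run_with_locals(source, local_vars):
    """Run source with an empty globals dict and local_vars as the locals mapping."""
    exec(source, {}, local_vars)
    return local_vars


namespace = run_with_locals("a = a + 1\nb = a * 10", {"a": 1})
print(namespace)  # {'a': 2, 'b': 20}
```

Both the modified variable and the newly added one end up in the dictionary, which is as close as Python gets to handing a "binding" to the evaluated code.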

Sunday, 23 November 2025

How Exceptions Work

It's been quite a while since I first complained about the lack of safe-navigation and coalesce operators in Python, and provided a basic alternative. I've also complained about the lack of try-expressions in Python, and also provided a basic alternative. Indeed, there's not a strong reason for having 2 separate functions, I think we can just use do_try for the safe-get and coalesce option.



from typing import Any, Callable

def do_try(
    action: Callable[[], Any],
    exceptions: type[BaseException] | tuple[type[BaseException], ...] = Exception,
    on_except: Any = None,
) -> Any:
    """
    simulate 'try expressions'
    exceptions is an exception class or a tuple of classes (an except clause does not accept a list)
    on_except can be a value or a Callable (that receives the Exception)
    """
    try:
        return action()
    except exceptions as ex:
        return on_except(ex) if callable(on_except) else on_except

# Person is assumed to be defined elsewhere
person = Person()
embassy = do_try(lambda: person.country.main_cities[0])


I also complained about how absurd it feels to have a get method for dictionaries but not for lists and other sequences. That means that we end up writing code like this:


x = items[i] if len(items) > i else "default"

Of course, that would not be necessary if we had safe-navigation, but as we don't have it, we can just use the do_try function:


x = do_try(lambda: items[i], on_except="default")

And here comes the interesting part. Using do_try means using try-except under the covers, which, compared to an if conditional, seems like something to avoid in performance terms, right? Well, I've been revisiting the internals and cost of exceptions. Since version 3.11 Python has zero-cost exceptions. This means that (as in Java) having try-except blocks in your code has practically no performance effect as long as no exception is raised. The "zero cost" refers only to that case: there is still a cost when an exception is actually raised.

Modern Python uses exception tables. For each function containing try-except blocks an exception table is created, linking the bytecode range of the try part to the handling code in the except part. Exception tables are created at compile time and stored in the code object. At runtime, if an exception occurs, the interpreter consults the exception table to find the handler that covers the current instruction and jumps to it. Given that in Python compilation normally happens when we launch the script, just before the code runs, creating the table also has a runtime cost, but it's minimal, as it happens only once per function (when its code is compiled), not every time the function is executed. So the real cost comes when an exception is raised: creating the exception object, searching the exception table, unwinding and jumping to the handler.
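On Python 3.11 or later the table can be peeked at through the co_exceptiontable attribute of the code object. A minimal sketch (exception_table_size is a hypothetical helper name, and on older versions the attribute does not exist, so the helper returns None):

```python
import sys


def with_handler(items, i):
    try:
        return items[i]
    except IndexError:
        return None


def without_handler(items, i):
    return items[i]


def exception_table_size(func):
    """Length in bytes of the function's exception table, or None before Python 3.11."""
    table = getattr(func.__code__, "co_exceptiontable", None)
    return None if table is None else len(table)


print(sys.version_info[:2])
print("with try-except:", exception_table_size(with_handler))
print("without try-except:", exception_table_size(without_handler))
```

The table belongs to the code object, so it is built once and shared by every call to the function.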

Throwing/raising an exception felt like a low level mechanism to me, but it's not at all.

Language-level exceptions are software constructs managed by the runtime (JVM for Java, CPython for Python). They do not involve the OS unless the program crashes. So when you use a throw/raise statement in your code there's no software interrupt of any sort, it's just one more instruction (or several). The Python interpreter will come across a RAISE_VARARGS bytecode instruction, and it will search the exception table of the current function and, if needed, those of the functions up the call stack, trying to find an exception handler.

Notice that the same happens in the JVM. The Java compiler creates an exception table for each method and stores it in the .class file. This table maps bytecode ranges to handlers (catch blocks) and the type of exception they handle. When the class loader loads the class the JVM stores this table in the method’s metadata. Given that the JVM comes with JIT compilation, there's an additional level. When the JIT compiles the method:

  • The JIT generates native machine code for the method.
  • It also creates a new exception table for the compiled code, because the original bytecode offsets are no longer relevant and the JIT needs to map native instruction addresses to handler entry points.

This table is stored alongside the compiled code in the JVM’s internal structures.

So once a method has been compiled by the JIT at runtime we'll have two exception tables, the initial one for the bytecode form of the method (that is kept around in case we have to deoptimize from native back to bytecodes), and the table for native code. Notice that when the JIT compiles the bytecodes to native code we'll incur a very small extra cost for the creation of this additional table.

With all the above, using do_try() for safe indexed access seems a bit overkill (unless we're sure that the access is very rarely going to fail and raise), and having a specific convenience function for it makes sense:


from typing import Any, Sequence, TypeVar

T = TypeVar("T")

def get_by_index(sequence: Sequence[T], index: int, default: Any = None) -> Any:
    """
    Safely access an element by index in a sequence, where sequence is any class supporting __getitem__ and __len__.
    like: list, str, tuple, and bytes
    Negative indexes are treated as out of range and return the default.
    Usage: get_by_index(my_list, 2, default='Not Found')
    """
    return sequence[index] if 0 <= index < len(sequence) else default

We could generalize the function for nested access, but once we start looping with if conditions, beyond some number of iterations the try-except version will probably end up performing better.
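To check that intuition on your own machine, here is a rough micro-benchmark sketch with timeit (the function names are hypothetical, and I give no numbers, as they depend on the Python version and the hardware):

```python
import timeit


def get_with_check(seq, i, default=None):
    # Bounds check first, no exception machinery involved
    return seq[i] if 0 <= i < len(seq) else default


def get_with_try(seq, i, default=None):
    # Optimistic access, pays only when the index is out of range
    try:
        return seq[i]
    except IndexError:
        return default


items = list(range(10))
for label, index in (("hit", 3), ("miss", 50)):
    t_check = timeit.timeit(lambda: get_with_check(items, index), number=50_000)
    t_try = timeit.timeit(lambda: get_with_try(items, index), number=50_000)
    print(f"{label}: check={t_check:.4f}s try={t_try:.4f}s")
```

The "miss" case is where the try-except version pays for creating and handling the exception.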

Friday, 14 November 2025

FICXixon 2024

I guess as you get older (OK, let's say mature, to sound nicer) traditions become more and more important. One more year, and one more edition of FICXixón is almost here, and as usual, I realise I have not yet published my post about the previous edition, so here it goes:

The 62nd edition of FICXixón took place from November 15th to 23rd. This was a dry November here, which makes me rather angry. I love rainy weather, I've always loved it, but now even more, probably due to my youth memories (and I guess my Galician heritage also plays its part). All in all I attended 6 films, which is my record since I returned to Xixón in 2018. This year I was not so busy at work, and had more time to check the programme and attend screenings. I watched 2 excellent films, a good film, an interesting documentary, and 2 films that were OK, but that I would not watch again. From the micro-reviews below you can guess which is which.

  • The Antique, Friday 15, 19:00, OCine. The best film I've watched in this edition. I was quite in doubt between this one and Bird, but the trailer of "The Antique" looked so good (additionally, a story set in Russia is more appealing to me now than one set in the UK), and in the end I settled on this gorgeous Georgian film. An old flat in a historical building in central Saint Petersburg, the snow-covered streets, an old man approaching the end, a gorgeous Georgian woman, antiquities, social unrest. I think I've said enough.
  • Una Ballena, Saturday 16, 22:00, Teatro Jovellanos. Basque film mixing neo-noir and horror-fantasy. It had excellent reviews, and I'm not sure why it did not work for me. There was an "Encuentro con el público" after the screening, where the director and the main actress, the beautiful Ingrid, discussed the film with the audience. Some people expressed how (positively) shocked they were by the film, and honestly I felt a bit out of place.
  • La Prisonnière de Bordeaux, Tuesday 19, 21:30, Yelmo Ocimax. When FICXixón 2022 dedicated a retrospective to Patricia Mazuy I watched three of her films, and I loved 2 of them. And I loved even more the "Encuentros con el público" with her. She was so funny and expressive. So when I found out that they were programming her latest film I could not miss it, all the more so as it stars the charming Isabelle Huppert, one of my favorite actresses. So I even took the bus to go to the Yelmo cinema located far away in the outskirts, something I'd never done before for a FIC film! I did not regret it, the film is the kind of bizarre, funny, melancholic product that you could expect from these 2 crazy women.
  • When the Light Breaks, Wednesday 20, 22:00, Teatro Jovellanos. Excellent Icelandic drama. Grieving for a loved one is even worse when you are (and he was) young, and even worse when you have to hide how broken you are because you had a secret relationship. The light sequences at the start and the end of the film are mesmerizing.
  • Que se sepa (Indarkeriaren pi(h)artzunak), Thursday 21, 22:00, Escuela de Comercio. Interesting and necessary Basque documentary about one more of the many episodes of sorrow and pain brought up by the long and bloody conflict in the Basque Country. This time ETA and the Spanish Government join forces to kill an innocent man and destroy his family.
  • Fréwaka, Saturday 22, 19:15, OCine. It was preceded by a Colombian short, "La noche del Minotauro". Irish horror film, it was entertaining I think, but had little effect on me, so little that 1 year after I hardly remember anything about it.

Sunday, 9 November 2025

CGNAT

My networking knowledge is rather basic, and I've lately come across a problem/limitation I was not aware of and that I think is increasingly common: CGNAT. With my Internet Provider (Telecable) in Asturies, my FTTH router (a nice ZTE F6640) has a stable IP. I mean, it's not static, but it rarely changes (even after rebooting it). So when I recently felt that it could be convenient for me to occasionally connect to one of the computers in my LAN from outside, I thought it would be feasible.

So let's say I want to be able to ssh into my RasPi 5 from downtown while I discuss with my friends how woke ideology is destroying humanity. The DHCP server in my router is configured to provide a static IP to all significant devices in my LAN, let's say 192.168.1.5 for my RasPi 5. To make port 22 in my RasPi accessible from outside I have to configure port forwarding in my router. It's just a matter of telling the router "forward incoming connections to one of your ports (let's say 10022) to port 22 in 192.168.1.5". I'd never done it before, but it seems like something that has existed for decades and should work. So I connected my laptop to my mobile phone hotspot, to simulate the "I'm in the outside world" thing, and tried. And tried, and tried... to no avail.

Checking some forums with similar questions involving other Internet providers in Spain I came across this fucking technology: CGNAT

Carrier-grade NAT (CGN or CGNAT), also known as large-scale NAT (LSN), is a type of network address translation (NAT) used by ISPs in IPv4 network design. With CGNAT, end sites, in particular residential networks, are configured with private network addresses that are translated to public IPv4 addresses by middlebox network address translator devices embedded in the network operator's network, permitting the sharing of small pools of public addresses among many end users. This essentially repeats the traditional customer-premises NAT function at the ISP level.

My internet provider in Asturies continues to use IPv4 (that's not the case in France, where to my surprise I recently found out that IPv6 is in use), and given that it does not have enough public IP addresses for all its customers, it's adding an extra NAT (Network Address Translation) layer.

I had got my router's public address using curl ident.me, which gave me a nice public 85.152.xxx.yyy address, but if I connect to my fiber router and check there, I see a different one: 100.102.x.y. Well, that's not a public IP, and it's an indicator that my ISP is using CGNAT, as explained here.

If it's any of the following, then your router doesn't have a public IP address:

  • 192.168.x.x
  • 10.x.x.x
  • 172.16.x.x through 172.31.x.x
  • 100.64.x.x through 100.127.x.x

The last one is usually indicative of your ISP using CGNAT.
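The check above can be sketched with Python's standard ipaddress module. The function name and the returned labels are made up for this example, and anything outside the listed ranges is simply reported as "other":

```python
import ipaddress

# 100.64.0.0/10 spans 100.64.x.x through 100.127.x.x (the shared address space used for CGNAT)
CGNAT_RANGE = ipaddress.ip_network("100.64.0.0/10")
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),   # 172.16.x.x through 172.31.x.x
    ipaddress.ip_network("192.168.0.0/16"),
]


def classify_wan_address(address: str) -> str:
    """Classify the WAN address shown by the router: 'cgnat', 'private' or 'other'."""
    ip = ipaddress.ip_address(address)
    if ip in CGNAT_RANGE:
        return "cgnat"
    if any(ip in network for network in PRIVATE_RANGES):
        return "private"
    return "other"


print(classify_wan_address("100.102.1.2"))   # cgnat
print(classify_wan_address("192.168.1.5"))   # private
```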

Summing up, my laptop has a private 192.168.x.x IP address. My fiber router faces the outside world with another non-public IP address (100.102.x.y). Other customers in my area and I are connected to an upstream router in my ISP's network, and this one faces the outside world with the 85.152.xxx.yyy public IP that I can see with ident.me. So in order for the connection from the outside to my RasPi to work I would also have to set up port forwarding in that upstream ISP router shared with my "neighbours". So, no way...

Well, there's another way (that I have not tried) to set this up, a sort of reverse approach. In the last year I've been using SSH tunnels to connect to some non-public servers at work through a "Bastion" work server with a public IP. With a standard SSH tunnel I basically create an SSH connection to that Bastion server telling it (the Bastion server) that any connection that goes through it (through that "tunnel") has to be forwarded to another server. There are also reverse SSH tunnels, where I create an SSH connection to a server (a tunnel) telling that server that any connections it receives on a certain port have to be forwarded to "me" through that tunnel, to a certain port on my machine. So if you have a server on the internet (Azure, AWS...) you could use it to create a reverse SSH tunnel to your PC located behind CGNAT. All this is explained for example here.
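As I have not tried it, take this only as a sketch of what the command would look like, built in Python so it can be inspected without running anything. The -N and -R options are standard OpenSSH options; the server name, user and port numbers are hypothetical placeholders:

```python
import shlex


def reverse_tunnel_command(server, remote_port, local_port=22, local_host="localhost"):
    """Build the ssh command for a reverse tunnel.

    -R remote_port:local_host:local_port asks the server to forward connections
    arriving on remote_port back through the tunnel to local_host:local_port.
    -N means no remote command is executed, the connection is only a tunnel.
    """
    return ["ssh", "-N", "-R", f"{remote_port}:{local_host}:{local_port}", server]


# To be launched from the machine behind CGNAT (hypothetical user and server name)
command = reverse_tunnel_command("user@my-server.example.com", 10022)
print(shlex.join(command))
# ssh -N -R 10022:localhost:22 user@my-server.example.com
```

With the default sshd configuration the forwarded port is only bound to the loopback interface of the server, so you would first log in to the server and from there run ssh -p 10022 localhost.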

Tuesday, 28 October 2025

SqlAlchemy Registry

After talking about Persistence Ignorance and mapping styles in SqlAlchemy in my previous post, it's time now to take a look at an interesting technique used by the Declarative mapping. Whatever mapping style you use, SqlAlchemy relies on a registry that stores the information mapping entities to tables, fields to properties, relations, etc. When using the Imperative mapping you work directly with that registry (sqlalchemy.orm.registry).


import dataclasses

from sqlalchemy import MetaData
from sqlalchemy.orm import registry

@dataclasses.dataclass
class Post:
    title: str
    content: str

metadata = MetaData()
mapper_registry = registry(metadata=metadata)

# table_post is assumed to be a sqlalchemy Table defined elsewhere
mapper_registry.map_imperatively(
    Post,
    table_post,
    properties={
        #"post_id": table_post.c.PostId,
        "title": table_post.c.Title,
        "content": table_post.c.Content
    }
)

But when using the declarative mapping you're not aware of that registry as you normally don't interact with it at all, though notice that you still have access to it through the Base class.


from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

class Post(Base):
    __tablename__ = "Posts"
    post_id: Mapped[int] = mapped_column("PostId", primary_key=True, autoincrement=True)
    title: Mapped[str] = mapped_column("Title")
    content: Mapped[str] = mapped_column("Content")
	
# we have not directly used the registry at all in the above code, but it's still there, accessible through the Base class:
print(f"{Base.registry=}")
# Base.registry=
    

So how does the registry get set? Well, your entities get registered in that registry by leveraging the inheritance and metaclass machinery to obtain a behaviour that is similar to Ruby's inherited hook. Remember that I already talked in a previous post about simulating another Ruby metaprogramming hook, the method_added hook, by means of metaclasses. We can use metaclasses to execute some action each time a class based on that metaclass is created (putting the code to be executed in the __new__ or __init__ methods of the metaclass). In our case, we want to execute code that adds each model class to the registry. For that, the Base class that we define for our entities can have DeclarativeMeta as its metaclass. We can do this by setting the metaclass and the registry instance ourselves:


from sqlalchemy.orm import DeclarativeMeta, registry

mapper_registry = registry()

class Base(metaclass=DeclarativeMeta):
    registry = mapper_registry


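Or by inheriting from DeclarativeBase, which will also take care of setting the registry in our Base class. Notice that, as far as I can tell from the SqlAlchemy 2.0 source, DeclarativeBase does not use DeclarativeMeta itself: its metaclass is DeclarativeAttributeIntercept (the parent of DeclarativeMeta), and the registration of each subclass is triggered from __init_subclass__. The effect is the same, some code runs each time a subclass is created.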



class Base(DeclarativeBase):
    pass
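Leaving SqlAlchemy aside, the "run some code each time a subclass is created" trick can be sketched with a plain metaclass. RegisteringMeta and the dictionary used as registry are made up for this illustration, they are not SqlAlchemy classes:

```python
class RegisteringMeta(type):
    """Metaclass whose __init__ runs once for each class created with it."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        if bases:  # skip the Base class itself, register only its subclasses
            cls.registry[name] = cls


class Base(metaclass=RegisteringMeta):
    registry = {}  # shared by all subclasses through normal attribute lookup


class Post(Base):
    pass


class Comment(Base):
    pass


print(sorted(Base.registry))  # ['Comment', 'Post']
```

The model classes never mention the registry, yet defining them is enough to get them registered.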


We can take a look at the DeclarativeMeta code to see how it works its magic:


class DeclarativeMeta(DeclarativeAttributeIntercept):
    metadata: MetaData
    registry: RegistryType

    def __init__(
        cls, classname: Any, bases: Any, dict_: Any, **kw: Any
    ) -> None:
        # use cls.__dict__, which can be modified by an
        # __init_subclass__() method (#7900)
        dict_ = cls.__dict__

        # early-consume registry from the initial declarative base,
        # assign privately to not conflict with subclass attributes named
        # "registry"
        reg = getattr(cls, "_sa_registry", None)
        if reg is None:
            reg = dict_.get("registry", None)
            if not isinstance(reg, registry):
                raise exc.InvalidRequestError(
                    "Declarative base class has no 'registry' attribute, "
                    "or registry is not a sqlalchemy.orm.registry() object"
                )
            else:
                cls._sa_registry = reg

        if not cls.__dict__.get("__abstract__", False):
            _ORMClassConfigurator._as_declarative(reg, cls, dict_)
        type.__init__(cls, classname, bases, dict_)


I'm involved in some projects where the Database is not a critical element. We don't retrieve data from it, we just use it as an additional storage for our results, but the main store for those results is json/csv files. This means that if the Database is down, the application should run anyway. So it's important for me to be clear about which operations involve database access (and hence an error if the DB is not accessible), and also about when an incorrect mapping will raise an error. Let's see:

  • Adding classes to the registry (either explicitly with the imperative mapping or implicitly with the declarative one) does not perform any check with the database (so if the DB is down or there's something wrong in our mapping, like wrong tables or columns, we won't find it until later).
  • Creating a SqlAlchemy engine does not perform any connection to the DB either.
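  • Creating a Session does not connect to the Database either: the Session only requests a connection from the engine when it first needs to talk to the database (a query, a flush...). It does not perform any model verification.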
  • Adding objects to a Session won't check the model until the moment when you do a flush or a commit (that indirectly performs a flush).
  • Performing a select Query through a Session will obviously generate an error if any of the mappings for the tables involved in the query is wrong.



Friday, 24 October 2025

Persistence Ignorance

I've used SqlAlchemy in some projects (basic use, projects where the Database is just one of multiple datasources), and until recently I'd been sticking to the Imperative mapping style. I grew up as a developer with Persistence Ignorance (PI) as a guiding principle (keep your domain model free from infrastructure concerns like database access, so it remains clean, testable, and focused on business logic), so that was the natural thing for me, and I was really surprised to see that SqlAlchemy recommends using the Declarative mapping style, where the entities are totally aware of the specific persistence mechanism. .NET Entity Framework and NHibernate do a good job of allowing us to have entities that are "almost" persistence ignorant. I say "almost" because if you check this list of things that go against Persistence Ignorance, you'll recognize some Entity Framework requirements, like a parameterless constructor and virtual properties for lazy-loaded relations. You can have all the additional constructors that make sense for your entities; EF just needs the parameterless one, as it initializes your entities by calling it and then setting properties as needed. As for the virtual properties, EF implements lazy loading by creating proxy classes. If you have a Country entity with a lazy-loaded navigation property cities, EF will create a proxy class that inherits from Country and overrides the cities property, implementing the lazy-loading logic there.

Using the imperative mapping in SqlAlchemy gives you even more freedom. Your entities can have any constructor, as SqlAlchemy leverages the separation between Python's __new__ and __init__: it does not invoke __init__ to initialize the entities it loads, but sets the attributes one by one. Then, the dynamic nature of the language means that you don't have to mark in any special way the properties corresponding to lazy-loaded relationships, and SqlAlchemy does not need to resort to proxy classes to implement lazy loading, as it leverages Python's dynamism and attribute lookup logic. I think that for each lazy relation in an entity a descriptor is added to the class. When you first access the corresponding attribute the lookup reaches the descriptor, which performs the corresponding query and stores the result in the instance, so that the next time you access the relation the values are retrieved from the instance. I guess this is more or less related to what I discuss here.
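This is not SqlAlchemy's actual code, just a toy sketch of the two mechanisms as I understand them: creating an instance through __new__ without running __init__, and a non-data descriptor that loads on first access and then caches the value in the instance. LazyRelation, load_cities and query_log are made-up names, and the "query" is faked with a list:

```python
class LazyRelation:
    """Non-data descriptor: runs the loader on first access, then caches on the instance."""

    def __init__(self, name, loader):
        self.name = name
        self.loader = loader

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.loader(instance)
        # From now on the instance attribute wins over this descriptor in the lookup
        instance.__dict__[self.name] = value
        return value


query_log = []  # records each fake "query"


def load_cities(country):
    query_log.append(country.name)
    return [f"{country.name} city 1", f"{country.name} city 2"]


class Country:
    """Plain entity, unaware of any persistence concern."""

    def __init__(self, name, population):
        self.name = name
        self.population = population


# The "mapping" step attaches the descriptor to the existing class
Country.cities = LazyRelation("cities", load_cities)

# The "loading" step creates the instance without calling __init__
country = Country.__new__(Country)
country.name = "France"

print(country.cities)   # first access runs the loader
print(country.cities)   # second access is served from the instance
print(query_log)        # ['France']
```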

All this said, we should also note that (as explained here) there's still some "persistence leakage" into your entities when using the Imperative mapping. While you define your entity classes fully unaware of the persistence, SqlAlchemy (when adding them to the registry) makes them aware of the persistence mechanism by adding different attributes at the class level and at the instance level. For example attributes like _sa_instance_state or _sa_lazy_loader (these are part of SQLAlchemy’s internal machinery to track state and identity, manage lazy loading and relationship resolution and hook into attribute access dynamically). So your entities become bloated with extra attributes that you don't use on your own, and if you serialize them to json or whatever, they'll show up.

In the end I've ended up having separate Model entities (that use the declarative mapping) and Domain entities (that know nothing about the database), plus mapper classes/functions that map Model entities to Domain entities and vice versa. This gives you almost full PI. I say almost because you still end up with table IDs leaking into your Domain entities, but this is a more than acceptable compromise. Anyway, you could still get rid of that by declaring your Domain entities without the ID (Country class) and declaring additional child entities (CountryIdAware class) that incorporate the ID. Your Model-to-Domain mappers will indeed create CountryIdAware instances that will be passed to your Domain, but the Domain will be aware of them just as Country instances; it won't see the ID attribute.
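A minimal sketch of that arrangement, using plain dataclasses so that it runs without a database. CountryModel is only a stand-in for the class that would use the declarative mapping, and the field and function names are assumptions for the example:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Country:
    """Domain entity: knows nothing about the database, not even the ID."""
    name: str
    capital: str


@dataclass
class CountryIdAware(Country):
    """Child entity created by the mappers, carrying the table ID."""
    id: Optional[int] = None


@dataclass
class CountryModel:
    """Stand-in for the Model entity mapped with the declarative style."""
    country_id: Optional[int]
    name: str
    capital: str


def to_domain(model: CountryModel) -> Country:
    # The Domain receives it typed as Country, the ID just travels along
    return CountryIdAware(name=model.name, capital=model.capital, id=model.country_id)


def to_model(country: Country) -> CountryModel:
    # A Country created inside the Domain has no ID yet
    return CountryModel(
        country_id=getattr(country, "id", None),
        name=country.name,
        capital=country.capital,
    )


france = to_domain(CountryModel(country_id=7, name="France", capital="Paris"))
print(isinstance(france, Country))
print(to_model(france))
print(to_model(Country(name="Spain", capital="Madrid")))
```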

Sunday, 12 October 2025

Truffle Bytecode DSL

I have a fascination with Graal and the Truffle interpreter framework, though it's all from a theoretical point of view, as I've never built an interpreter myself. The thing is that I've recently found out about a new addition to Truffle, the Bytecode DSL. This means that Truffle now supports the 2 main types of interpreters: Tree Parsing Interpreters (AST interpreters) and bytecode interpreters.

I found this a bit odd at first, as it was not clear to me how to reconcile this with what I understood as the main super-powers of Truffle. The "traditional" approach in Truffle is writing Tree Parsing Interpreters (AST interpreters). Summarizing what I explain in some of my previous posts, the nodes in this tree correspond to Java methods (Java bytecodes) that the interpreter invokes. These nodes can get specialized to more specific nodes thanks to profiling, and then when a guest language method is hot, the Java bytecodes for the nodes making up that method are sent to Graal to be compiled to native code (this is the Partial Evaluation part). The equivalent of specializing the AST nodes also exists for the bytecode case: those bytecodes can be specialized/quickened in a way very similar to what the Python Adaptive Specializing Interpreter does. But for the compilation part, if with the Bytecode DSL we no longer have a tree made up of nodes, are we missing the Partial Evaluation magic?

No, we are not missing anything. For each Guest language bytecode of the program we are executing we'll have a Java method that executes it. When a guest language method is hot, the Java methods for the bytecodes making up that method will be sent to Graal for compilation, so this is the same we do with AST nodes.

Confirming this with a GPT has given me a better understanding of Partial Evaluation and its optimizations. When a method is hot and its nodes (or guest language bytecodes) are sent for compilation, Truffle can decide to send only part of the method, not the whole method (path-specific compilation). When Truffle does partial evaluation, it relies on the execution paths that were actually taken during profiling. This means that if we have an "if-else" and the profiling shows that the condition is always true, it will only send the "if part" for compilation. Of course it adds guards so that if the assumptions made become false it can deoptimize the code (transfer back to the interpreter).
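To make the "speculate, guard, fall back" idea more tangible, here is a toy sketch in Python. It has nothing to do with Truffle's real API (AddNode and its states are made up): a node starts uninitialized, specializes itself for ints after seeing ints, and gives up the specialization when its guard fails.

```python
class AddNode:
    """Toy self-specializing node with a guarded fast path."""

    def __init__(self):
        self.state = "uninitialized"

    def execute(self, left, right):
        if self.state == "int":
            if type(left) is int and type(right) is int:  # guard
                return left + right                       # speculated fast path
            self.state = "generic"                        # guard failed: "deoptimize"
        elif self.state == "uninitialized":
            # First execution acts as profiling and decides the specialization
            both_ints = type(left) is int and type(right) is int
            self.state = "int" if both_ints else "generic"
        return self._generic_add(left, right)

    @staticmethod
    def _generic_add(left, right):
        # Stand-in for the slow path that handles any operand types
        return left + right


node = AddNode()
print(node.execute(1, 2), node.state)      # 3 int
print(node.execute(3, 4), node.state)      # 7 int
print(node.execute("a", "b"), node.state)  # ab generic
```

In the real thing the fast path would be native code produced by Graal and the fallback a transfer back to the interpreter; here both are just Python branches.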

There's an additional element in how Truffle can achieve such excellent performance, inlining (both for AST and bytecode interpreters). When Truffle sends the java methods for the nodes or bytecodes of a method (or of part of a method based on optimizations) for compilation, it will also send those methods called from that method, and will inline them in the generated native code.

A common taxonomy of JIT compilers is Method-based (Graal, HotSpot, .Net...) vs Tracing JITs (LuaJIT, older TraceMonkey).

Method-Based JITs (like Graal/Truffle, HotSpot, .NET)

  • Compilation unit: entire method
  • When a method becomes hot, compile it
  • Can still do path specialization within that method
  • Inlining: pulls called methods into the caller during compilation

Tracing JITs (like LuaJIT, older TraceMonkey)

  • Compilation unit: hot trace across multiple methods
  • Traces execution through method calls, loops, returns
  • The "trace" might start in methodA(), call into methodB(), and return - all one compiled unit
  • More aggressive cross-method optimization

The interesting thing is that while Graal is primarily method-based, with very aggressive inlining it can achieve trace-like behavior.

Pure tracing JITs can cross method boundaries more naturally, but modern method-based JITs like Graal blur this distinction through aggressive inlining. The end result can be quite similar, just with different conceptual models!

Example:

Tracing JITs: hot loop detected spanning multiple methods

  • Record exact execution path
  • Compile: loop_header → methodA → methodB → loop_back
  • One flat piece of native code

Graal: methodA is hot

  • Compile methodA
  • Inline methodB call
  • Inline methodC call
  • Result looks similar but structured around methodA