Wednesday, 27 March 2013

Czechoslovakia, part I

Don't worry, you haven't gone back in time 40 years, and I'm well aware of the collapse of the "Soviet bloc"; I just think the title really fits this post.

While preparing my next short stay in the most beautiful city in the world (obviously this is a purely subjective claim, but based on my eurocentric architectural taste Prague beats other dream spots like Vienna, Vilnius or Florence, and unfortunately I have not had a chance to visit Saint Petersburg yet), I've come across some really interesting findings that I'd like to share here.

From previous visits I already had some limited knowledge of the Prague Spring events and figures like Jan Palach. To my astonishment, he was not the only person to set himself on fire to protest against the Soviet invasion; his radical protest was followed by others like Jan Zajíc and Evžen Plocek. Indeed, Wikipedia has a painfully long list of people who decided to carry out such a radical (and to my mind fruitless) means of protest.

Digging a bit more into that tragic spring, one comes across 2 enormous political figures. One of them, František Kriegel, can only be labelled as a hero. You must read his biography on Wikipedia, but just to pique your interest I'll give a short introduction of my own:

A Jewish guy who had to leave his native Galicia (present-day Western Ukraine) due to rising antisemitism, he moved to Prague, where he worked hard to pay for his medical studies and became a communist. Afterwards, he moved on to Spain to join the International Brigades and fight against fascism. After the victory of the fascist scum he escaped to France, where he was captured and interned in a concentration camp (oh, yes, the treatment given by the French government of that time to the Spanish antifascists should be enough to make that whole country ashamed for centuries...). After taking part in WWII, he returned to the new state of Czechoslovakia, where he continued his medical and political work, and as a true communist he opposed the distorted and tyrannical Stalinist policies...

The other great political figure is the Slovak Alexander Dubček. Again, reading his biography is a real must, but I'll try to whet your appetite with an introduction:

I think it's rather accurate to label him as an idealist. He was brought up in this interesting "experiment", and the horrors of Stalinism didn't move him away from the Communist ideals, but I think they made him more aware and critical of the failed interpretations carried out in many countries, which would end up leading him to his "socialism with a human face". Unfortunately, this was fiercely repressed by Moscow, and what could have given way to a real alternative between Western Capitalism and Eastern distorted pseudo-Communism brought about 20 years of repression first, the embrace of Capitalism then, and the current collapse of the Western Welfare state...

WWII was a really hard time for the whole of Czechoslovakia, but I guess it was quite a bit worse on the Czech side, with the annexation of the Sudetenland and the occupation of the rest of the country by Nazi Germany. When you stroll around the astonishing Prague Castle, it's hard now to imagine the Nazi flag waving there and the dreadful Heydrich laying out his plans for mass murder from his office in the castle. The Czech Resistance Movement was particularly angry with the alleged acquiescence of part of the Czech population towards the occupiers, so they chose to undertake a radical plan: the assassination of Heydrich. In principle we could say that their plan worked, as 2 Czechoslovak revolutionaries managed to kill the Nazi monster, but the wave of brutal repression brought about by the attack, including the annihilation of 2 Czech villages (Lidice and Ležáky), makes it quite difficult to think of this action as a successful idea. It's OK if you want to be a martyr, but how can you grant yourself the right to turn into martyrs those people you intend to protect and liberate? Anyway, it seems like this sadistic retaliation helped to put an end to that alleged acquiescence (now it was not just the Jews and the dissidents who were being murdered, it was ordinary Czech people).
This documentary makes a good introduction to this part of history.

Saturday, 23 March 2013

Casting, Conversion and Coercion

Casting, Conversion and Coercion can be confusing sometimes, and even more so if those discussing them are rooted in different languages. C#'s "hybrid Casting/Conversion" can be particularly confusing, as shown by the many questions on StackOverflow about this matter [1], [2]...
My own understanding has changed a bit as of late, based on some good reading that I've done.

I used to think of casting (upcasting and downcasting) just in terms of programmers advising/instructing the compiler. With upcasting we mainly tell the compiler which method we want to invoke: "hey, though I'm a Dog, I want to invoke the DoNoise method in my parent (Animal)". With downcasting, we say: "hey, though you think I'm an Animal, I'm actually a Dog, so do a runtime check to verify that this is true and that you can invoke this Dog-specific method. If I'm lying, throw an exception". In C#, this means that the IL generated by the compiler will have a castclass operation, that is, a type check.

using System;

class Animal
{
 public void DoNoise()
 {
  Console.WriteLine("Animal.DoNoise");
 }
 public virtual void Move()
 {
  Console.WriteLine("Animal.Move");
 }
}

class Dog: Animal
{
 public new void DoNoise()
 {
  Console.WriteLine("Dog.DoNoise");
 }
 
 public override void Move()
 {
  Console.WriteLine("Dog.Move");
 }
 
 public void Bark()
 {
  Console.WriteLine("Dog.Bark");
 }
}

class App
{
 public static void Main()
 {
  Dog d1 = new Dog();
  ((Animal)d1).DoNoise();
  //upcasting and non virtual method, prints: Animal.DoNoise

  ((Animal)d1).Move();
  //prints: Dog.Move
  //as this is a virtual method we have dynamic binding here, so the casting has no effect
  //and it still calls Dog.Move

  Animal a1 = d1;
  ((Dog)a1).Bark();
  //prints: Dog.Bark
  //compiler adds a runtime check, so at runtime a1 is verified to be a Dog, and Bark is invoked
 }
}

For me, the above has nothing to do with Conversion, but with asking the compiler to treat an object of allegedly one type as if it were of another type. The thing is that in C# the cast operator can be used both for the hinting described above and to instruct the compiler to generate real conversions. Notice that C# defines a set of those explicit conversions, and we can define our own implicit (no cast used) or explicit (using the cast syntax) conversions. This feature will give you quite shocking code if you're used to thinking of casting in terms of subtyping, because you can end up finding things as odd as this:

class Table
{
    static public explicit operator Dog(Table t)
    {
       return new Dog();
    }
}

....
Dog d = (Dog)new Table();
...

All this is explained very well by the almighty Eric Lippert on StackOverflow and on his blog:

A "cast" is the usage of a cast operator. A cast operator instructs the compiler that either (1) this expression is not known to be of the given type, but I promise you that the value will be of that type at runtime; the compiler is to treat the expression as being of the given type, and the runtime will produce an error if it is not, or (2) the expression is of a different type entirely, but there is a well-known way to associate instances of the expression's type with instances of the cast-to type. The compiler is instructed to generate code that performs the conversion. The attentive reader will note that these are opposites, which I think is a neat trick.

  • My code has an expression of type B, but I happen to have more information than the compiler does. I claim to know for certain that at runtime, this object of type B will actually always be of derived type D. I will inform the compiler of this claim by inserting a cast to D on the expression. Since the compiler probably cannot verify my claim, the compiler might ensure its veracity by inserting a run-time check at the point where I make the claim. If my claim turns out to be inaccurate, the CLR will throw an exception.
  • I have an expression of some type T which I know for certain is not of type U. However, I have a well-known way of associating some or all values of T with an “equivalent” value of U. I will instruct the compiler to generate code that implements this operation by inserting a cast to U. (And if at runtime there turns out to be no equivalent value of U for the particular T I’ve got, again we throw an exception.)

So, a cast can generate a conversion (the compiler adds code for it), generate nothing at all (when it's an upcast), or just add a castclass operation at the IL level, that is, a type check.

As you can see, my explanation above corresponds just to point (1). So, as Eric mentions in another answer, we'd better think of casting just as syntax, a syntax which can mean 2 very different things. Eric calls this dual behaviour a "neat trick"; honestly, I would call it slightly confusing. If I want to do a conversion from one object to another I would prefer to state it clearly, with something like Convert.DoWhateverConversion.

It seems like people tend to use the term conversion for both cases: when we're just hinting to the compiler and no physical conversion of one object into another takes place (Identity conversion), and when an object is transformed into a different object (a double into an int, a string into an int...).

A "conversion" is an operation by which a value of one type is treated as a value of another type -- usually a different type, though an "identity conversion" is still a conversion, technically speaking. The conversion may be "representation changing", like int to double, or it might be "representation preserving" like string to object. Conversions may be "implicit", which do not require a cast, or "explicit", which do require a cast.

All the above is of great help to fully understand the difference between a cast and the as operator in C#. As explained here, the as operator is only related to the first part of a cast, not to the second one (so when it says "conversion", it's referring to the aforementioned "identity conversions"):

The "as" operator only considers reference, boxing and unboxing conversions.

Another concept related to conversions, and one that I think we've mainly become acquainted with in the JavaScript arena, is Coercion. I've read this somewhere and I think the explanation is clear and accurate: a "coercion" is a representation-changing implicit conversion. This post does a really good job explaining JavaScript coercion. That said, we could call C#'s representation-changing implicit conversions (like short to int) coercions. Bear in mind that these coercions really involve changing a value from one representation to another; it's not just a hint:

short s = 5;
s.GetType();
//System.Int16, so this representation now takes up 2 bytes
int i = s;
i.GetType();
//System.Int32, so this representation now takes up 4 bytes
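
For comparison, here is a minimal JavaScript sketch of the same distinction, since that is where coercion shows up most often. The variable names are made up for illustration; the first half relies on implicit conversions (coercions), the second half states the conversion explicitly.

```javascript
// Implicit conversions (coercions): the operator decides the target type.
var concatenated = "5" + 1;    // the number 1 is coerced to the string "1"
var subtracted = "5" - 1;      // the string "5" is coerced to the number 5
var looseEqual = ("5" == 5);   // == coerces before comparing
var strictEqual = ("5" === 5); // === never coerces

console.log(concatenated); // prints: 51
console.log(subtracted);   // prints: 4
console.log(looseEqual);   // prints: true
console.log(strictEqual);  // prints: false

// Explicit conversions: we state the target type ourselves.
var explicitNumber = Number("5") + 1;
var explicitString = String(5) + 1;

console.log(explicitNumber); // prints: 6
console.log(explicitString); // prints: 51
```

In both halves a new value with a different representation is produced; the only difference is whether we asked for it.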

For a better understanding of all this, I recommend reading these posts [1] and [2].

Wednesday, 20 March 2013

Primitive types, typeof and Type

I've come across 2 pretty good articles about one of those JavaScript topics that I still find confusing, so I'll write up here some notes for further reference.

Once we have embraced the idea that JavaScript is profoundly Object Based (functions are objects, the "inheritance" chain is based on other objects ([[Prototype]]), ...), it can come as a shock to learn that not everything in JavaScript is an Object. Same as with Java, we have Primitive Types that are not real objects (I guess for performance reasons), and same as in Java, these values get automagically boxed (and unboxed) into real Objects when needed. Notice that C#'s equivalent for this are Value types (the Primitive types nomenclature is only used to refer to that subset of value types that have an alias, like int for Int32, bool for Boolean...), and that C# goes one step further by allowing us to declare new Value types (structs).

So, as this post nicely explains we have 5 primitive types:

In JavaScript there are 5 primitive types: undefined, null, boolean, string and number. Everything else is an object. The primitive types boolean, string and number can be wrapped by their object counterparts. These objects are instances of the Boolean, String and Number constructors respectively.

Bear in mind that null and undefined are different from the other primitives, as they lack equivalent Objects to be boxed into and we can't get or set properties on them. I like the nomenclature used for them in this other excellent article, which refers to them (along with 0, false, "" and NaN) as Falsy Values.

Long story short, strings, booleans and numbers have an equivalent Object type (String, Boolean and Number), and the runtime takes care of converting from one to the other (box/unbox) when needed, for example each time we access a property on one of them: "myString".length, true.toString(), (5).toFixed()... So, the methods and properties live in the Boolean, Number and String objects, not in the primitive types. We can verify this:


"aa".__proto__;
//works fine, "aa" gets boxed and then we get the String.prototype

//however
Object.getPrototypeOf("aa");
// throws an exception in ES5 (ES2015 and later box the argument instead):
//TypeError: "aa" is not an object

5 == new Number(5);
//true, coercion at work

5 === new Number(5);
//false, identity comparison, no coercion applied

"aa" instanceof Object;
//false

"aa" instanceof String;
//false

new String("aa") instanceof Object;
//true
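
The boxing is only temporary, and null and undefined have nothing to be boxed into. Here is a small sketch of both points; trySetProperty and readLength are hypothetical helper names, and the try/catch is there because strict mode throws where sloppy mode silently ignores the assignment.

```javascript
// Setting a property on a primitive: the temporary wrapper is discarded,
// so the property is never there when we read it back.
function trySetProperty(primitive) {
    try {
        primitive.extra = "x"; // strict mode throws a TypeError here
    } catch (e) {
        // ignored on purpose: the outcome below is the same in both modes
    }
    return primitive.extra;
}

// Reading a property: works on boxable primitives, throws on null/undefined.
function readLength(value) {
    try {
        return value.length;
    } catch (e) {
        return e instanceof TypeError ? "TypeError" : "other error";
    }
}

// A real wrapper object, in contrast, keeps its properties.
var boxed = new String("aa");
boxed.extra = "kept";

console.log(trySetProperty("aa")); // prints: undefined
console.log(boxed.extra);          // prints: kept
console.log(readLength("aa"));     // prints: 2
console.log(readLength(null));     // prints: TypeError
```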

Related to all this, the typeof operator both helps and adds some confusion. It helps by returning a different string value for most primitive types ("string", "number", "boolean", "undefined") and by returning "object" for any other object. However, it also adds some confusion in two ways. First, typeof null returns "object" rather than "null", even though null is a primitive. Second, it returns "function" for functions instead of "object", something that could lead us to think that functions are primitive types, which is false. Functions are objects (myFunction instanceof Object is true).

>>> typeof true;
"boolean"

>>> typeof 5;
"number"

>>> typeof "aa";
"string"

>>> typeof undefined;
"undefined"

>>> typeof null;
"object"

>>> typeof {};
"object"

>>> typeof new String("aa");
"object"

>>> typeof function(){};
"function"

Just to finish, I'll add that the other day, while going through the ES5 documentation (yes, you're right, idiot, get a life and do something more productive... :-D), I ran into the section dedicated to Types. This paragraph fully matches the Primitive types vs Objects thing.

An ECMAScript language type corresponds to values that are directly manipulated by an ECMAScript programmer using the ECMAScript language. The ECMAScript language types are Undefined, Null, Boolean, String, Number, and Object.
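
To tie the quote back to typeof, here is a minimal sketch that maps a value to one of those six language types. languageType is a hypothetical helper, not something from the specification; it only patches the two typeof surprises (null and functions).

```javascript
// Maps a value to one of the six ES5 language types named in the quote.
// Later editions of the language add more primitive types, which this
// ES5-era sketch does not try to handle.
function languageType(value) {
    if (value === null) {
        return "Null"; // typeof null is "object", so check it first
    }
    switch (typeof value) {
        case "undefined": return "Undefined";
        case "boolean": return "Boolean";
        case "string": return "String";
        case "number": return "Number";
        default: return "Object"; // covers both "object" and "function"
    }
}

console.log(languageType(null));             // prints: Null
console.log(languageType("aa"));             // prints: String
console.log(languageType(new String("aa"))); // prints: Object
console.log(languageType(function () {}));   // prints: Object
```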

Thursday, 14 March 2013

Singletonize Revisited 2.0

I already posted about "singletonizing" a JavaScript constructor function a while ago. The other day, while going through this beautiful article about the internals of Function.bind, I realized that my previous approach had some limitations. The native implementation of Function.bind is internally more powerful than its JavaScript implementations (prototype.js, es5 shim), as it creates a special function Object with some missing, new and overridden internal properties. One point to note is that, contrary to what happens with the JavaScript implementations (which return anonymous functions), the name property of a bound function will be the same as that of the original function (engines following ES2015 prefix it with "bound ").

function saySomething(){
    console.log(this.name + " says something");
}

var boundSaySomething = saySomething.bind({name: "xuan"});
boundSaySomething();
//prints: "xuan says something"

console.log(saySomething.name);
//prints "saySomething"

console.log(boundSaySomething.name);
//printed "saySomething" when this was written; engines following ES2015 print "bound saySomething"

In my previous singletonizer, I was also creating an anonymous function, so its name was an empty string. The name property is not writable:

var f1 = function(){
    console.log("hi");
};

console.log(f1.name);
//prints: ""

var nameDescript = Object.getOwnPropertyDescriptor(f1, "name");
Object.keys(nameDescript).forEach(function(key){
    console.log(key + ": " + nameDescript[key]);
});

/* prints:
configurable: false
enumerable: false
value: 
writable: false
*/

so we have no way to set it once the function has been created. Then, the question is: how do we create a named function with a name that is only known at runtime? One option would be using new Function, but functions created that way do not capture their lexical scope (taken from MDN):

Note: Functions created with the Function constructor do not create closures to their creation contexts; they always run in the window context (unless the function body starts with a "use strict"; statement, in which case the context is undefined).
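
A quick way to see the limitation for yourself; compareScopes and secret are names made up for this sketch, and it assumes there is no global variable called secret.

```javascript
function compareScopes() {
    var secret = "captured";

    // A regular function expression closes over `secret`.
    var closure = function () { return typeof secret; };

    // A function built with the Function constructor only sees the global
    // scope, so `secret` is simply not defined inside it.
    var constructed = new Function("return typeof secret;");

    return { closure: closure(), constructed: constructed() };
}

var scopes = compareScopes();
console.log(scopes.closure);     // prints: string
console.log(scopes.constructed); // prints: undefined
```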

Well, the only option I can think of is using eval. OK, surely you've heard that eval is evil. Let's clarify this: eval is evil when you are evaluating a third-party string (either typed by the user, or returned by some server code not controlled by you), but there's nothing wrong with using eval (performance aside) with a string that is under your control (think of eval as a form of the "compiler as a service" that Mono and Roslyn implement). So, we end up replacing this:

var singletonized = function(){ ...

with this:

eval ("var singletonizedFunc = function " + initialFunc.name + "(){" ...

I've also improved my old approach a bit by adding a desingletonize method and by taking into account the case when the constructor function returns a value.
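
To make the idea concrete, here is a rough sketch of how such a singletonizer could look. It is not the code linked from this post: the body, the identifier check and the meaning of desingletonize (here it just hands back the original constructor) are all my assumptions for illustration. It uses a parenthesized function expression inside eval, rather than a var declaration, so that it behaves the same in strict mode.

```javascript
// Hypothetical sketch, not the code linked from the post.
function singletonize(initialFunc) {
    // Only interpolate strings we control: insist on a plain identifier.
    if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(initialFunc.name)) {
        throw new Error("singletonize needs a named constructor function");
    }
    var instance = null;
    // Direct eval can see the local variables `instance` and `initialFunc`.
    var singletonizedFunc = eval(
        "(function " + initialFunc.name + "() {" +
        "    if (instance === null) {" +
        "        var created = Object.create(initialFunc.prototype);" +
        "        var returned = initialFunc.apply(created, arguments);" +
        "        var isObject = returned !== null &&" +
        "            (typeof returned === 'object' || typeof returned === 'function');" +
        "        instance = isObject ? returned : created;" +
        "    }" +
        "    return instance;" +
        "})"
    );
    singletonizedFunc.prototype = initialFunc.prototype;
    // Assumption: desingletonize gives back the original constructor.
    singletonizedFunc.desingletonize = function () { return initialFunc; };
    return singletonizedFunc;
}

function Person(name) { this.name = name; }
var SinglePerson = singletonize(Person);
var first = new SinglePerson("xuan");
var second = new SinglePerson("other");

console.log(first === second);  // prints: true
console.log(SinglePerson.name); // prints: Person
console.log(second.name);       // prints: xuan
```

One caveat of this sketch: a constructor that happened to be called instance or initialFunc would shadow the local variables used inside the evaluated string, so a real implementation should guard against that.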

You can find the code here.

Sunday, 10 March 2013

Dependency Injection, Constructors and so on

The other day I came across, once again, one of those many almost philosophical matters programmers have to tackle so often (surely many come to mind: method vs getter, Constructors vs Create methods, Constructors vs Initialize methods, are Singletons an antipattern?...). This time the reason for my musings was Constructor Injection vs Setter Injection.

To start off, I think we should avoid by all means the possibility of creating objects in an unusable state. I mean, you can't guarantee that clients of your classes will follow all the necessary initialization steps for your objects beyond the constructor call itself, so constructors should give you "usable enough" objects that won't crash when methods are invoked on them.

That said, it seems like we should favor constructor injection. One problem, though, is that we can end up with constructors with huge parameter lists, which make the code quite difficult to understand. Once your object is no longer magically responsible for creating its dependencies (logger, validator, dao, profiler...) and these have to be provided, they end up in a long parameter list. How to shorten such a list is quite a common question. The usual advice is to rethink your design, as so many dependencies could mean that your object is taking on more than one responsibility. Surely many times this is true, but there are cases where your cohesive object still needs that bunch of parameters. This guy puts it quite well.

Generally I've found if there's more than 3, that's a sign to do a quick sanity check on the design. If there's more than 5, that's a major warning that something is probably wrong with the design.

However, note the word "probably" - in the end, the only real rule is use as many as needed to function, no more and no less. There are always exceptions and cases where more parameters makes the most sense.

One way to reduce the number of parameters is to analyze which of them are related enough to be put into a new class. This is a common refactoring, and has the advantage that you'll probably find some logic that can also go into that Transfer Object. Another usual technique is using a Builder with a fluent API. Well, I'm not very fond of this second approach; indeed, in a language featuring named parameters like C#, I would just opt for using them. In the end it's similar to the options objects that we use in jQuery UI (and JavaScript in general), and I think they make the code clear enough.
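
For those who haven't used them, this is roughly what the options object style looks like in JavaScript. createReportService and all its option names are invented for this sketch; the point is only that the call site names every argument, much like C# named parameters do.

```javascript
// Hypothetical factory: `repository` is essential, the rest have defaults.
function createReportService(options) {
    options = options || {};
    if (!options.repository) {
        throw new Error("repository is required");
    }
    return {
        repository: options.repository,
        pageSize: options.pageSize !== undefined ? options.pageSize : 20,
        culture: options.culture || "en"
    };
}

// The call site reads almost like named parameters.
var reportService = createReportService({
    repository: { name: "fakeRepository" },
    pageSize: 50
});

console.log(reportService.pageSize); // prints: 50
console.log(reportService.culture);  // prints: en
```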

An approach to Dependency Injection that I've put to work as of late, and that I quite like, is using a mix of constructor and setter injection. I use constructor injection for those parameters that are needed for the main functionality of the class (the repository for your controller, the connection string for your DAO...), while for those functionalities that are sort of an extra (like logging) I use Setter Injection. I like the idea of the constructor signature indicating only those pieces that are essential to the class.

A problem with Setter Injection is: what if someone decides to skip the IoC container and create the object manually, and forgets to set the property? (Certainly, you can't forget to invoke a constructor, but you can forget about the setter.) Well, to be on the safe side we'll need to initialize our setter dependencies with some "usable default", for example using the Null Object Pattern.
One more point for me is that, when going through the container, I'd like the system to work even if these "non essential components" have not been registered (we forget to register a logger, for example). With this in mind, we should declare such dependencies as optional. In Unity, this means using the [OptionalDependency] attribute.

When putting together these 2 techniques I've just mentioned, there's something more to take into account. Unity will invoke the constructor and then do the Setter Injections. For those Optional Dependencies that have not been registered, it'll set them to null. So, to avoid overwriting that safe Null Object that we have previously set, we'll need a check in our setter. All in all:

public class WebSocketsInterfaceRunner
{
    //safe default (Null Object Pattern), used when nobody sets the property
    private ILogger logger = new NullLogger();

    [OptionalDependency]
    public ILogger Logger
    {
        get
        {
            return this.logger;
        }
        set
        {
            //if defined as optional, the container will set it to null if it finds that it's not defined,
            //so we need this null check
            if (value != null)
                this.logger = value;
        }
    }

    //... rest of the class
}
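
The same combination can be sketched in JavaScript (a loose transliteration with no container involved, not Unity code). The connection parameter, the run method and the nullLogger object are assumptions made up for the example: the essential dependency goes through the constructor, while the optional logger starts as a Null Object and its setter ignores null or undefined.

```javascript
// Null Object: same interface as a real logger, does nothing.
var nullLogger = { log: function () {} };

// Constructor injection for what the class cannot work without.
function WebSocketsInterfaceRunner(connection) {
    if (!connection) {
        throw new Error("connection is required");
    }
    this.connection = connection;
    this._logger = nullLogger; // safe default
}

// Setter injection for the optional extra, guarded against null.
Object.defineProperty(WebSocketsInterfaceRunner.prototype, "logger", {
    get: function () { return this._logger; },
    set: function (value) {
        if (value !== null && value !== undefined) {
            this._logger = value;
        }
    }
});

WebSocketsInterfaceRunner.prototype.run = function () {
    this.logger.log("running");
    return this.connection.open();
};

var runner = new WebSocketsInterfaceRunner({ open: function () { return "open"; } });
runner.logger = null;      // what a container would do for an unregistered dependency
console.log(runner.run()); // prints: open
```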

This post shows a mindset rather similar to mine regarding Dependency Injection.

Saturday, 9 March 2013

Some fresh JavaScript oddities

Anyone who has done a minimum of JavaScript development is aware to a greater or lesser extent of how many oddities, quirks, gotchas... this beautiful language and its base library have. Coercion, Date.getMonth, parseInt, null and undefined... just to name a few. With the advent of ES5, a few more oddities joined the list, bringing some more pleasure and pain to our developer's life. I'll muse, praise and rant on some of them in this post.

Function.bind

When prototype.js introduced this function years ago I really appreciated it, mainly for the lexical this functionality, but over the years I've got so used to creating a closure myself to bind this that I gradually quit using it, so its addition to the ES5 standard went quite unnoticed by me. The other day, while reading this excellent article about the arrow functions coming to ES.next, it caught my attention that ES5 bound functions are a new type of function (they have extra and overloaded "internal properties" like bound, ...). So, they are different from the normal wrapper function that prototype.js or any ES5 shim would return. You can read more about bound functions here.
OK, enough preambles, why am I including Function.bind in this post?

Well, apart from binding this, Function.bind also allows for the binding of other parameters, what we commonly know as Partial Function Application. Good, two birds with one stone. The problem is that it does not allow for binding parameters while keeping the dynamic binding for this. If, for example, you bind a null or undefined value for this:
var func2 = func1.bind(null, "value1");
and later on you attach the bound function to an object and invoke it like this:
myObj.func2();
func2 won't get passed myObj, but null...
As I don't see any use in binding null to a function that is really using this, I think the correct/useful behaviour would be to consider that when a bound function hosts a null or undefined value in [[BoundThis]], it means that it's not bound.
As unfortunately that's not the case, we would need a separate function providing Partial Function Application (like the one provided by prototype.js and wrongly named curry...). The thing is that such a function does not exist in the standard library and once again we'll need to write our own (which is trivial, but to me such an omission makes the feature feel incomplete).
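
As a minimal sketch of what I mean, here is a hand-rolled partial next to Function.bind. The names partial, greet and myObj are just for illustration; greet falls back to "nobody" so that the example gives the same output in strict and sloppy mode.

```javascript
// Partial application that leaves `this` dynamically bound.
function partial(fn) {
    var boundArgs = Array.prototype.slice.call(arguments, 1);
    return function () {
        var callArgs = Array.prototype.slice.call(arguments);
        // `this` is whatever the caller provides at invocation time
        return fn.apply(this, boundArgs.concat(callArgs));
    };
}

function greet(greeting) {
    var who = (this && this.name) ? this.name : "nobody";
    return greeting + ", " + who;
}

var myObj = { name: "xuan" };
myObj.viaBind = greet.bind(null, "hello");  // `this` is fixed for good
myObj.viaPartial = partial(greet, "hello"); // `this` stays dynamic

console.log(myObj.viaBind());    // prints: hello, nobody
console.log(myObj.viaPartial()); // prints: hello, xuan
```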

Lack of Object.extend

This functionality is so useful that I really can't understand why it has not been added to the standard library. Sure it's easy to implement your own, or just use this or this, but really, why not add it to the standard?
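
For reference, a bare-bones version takes only a few lines. This is a sketch under the simplest assumptions (shallow copy of own enumerable properties, later sources win), not the implementation of any particular library; ES2015 later added Object.assign, which covers the same ground.

```javascript
// Shallow-copies own enumerable properties from each source into target.
function extend(target) {
    for (var i = 1; i < arguments.length; i++) {
        var source = arguments[i];
        if (source === null || source === undefined) {
            continue; // skip missing sources instead of throwing
        }
        Object.keys(source).forEach(function (key) {
            target[key] = source[key];
        });
    }
    return target;
}

var defaults = { color: "red", size: 10 };
var settings = extend({}, defaults, { size: 20 }, null, { visible: true });

console.log(settings.color);   // prints: red
console.log(settings.size);    // prints: 20
console.log(settings.visible); // prints: true
console.log(defaults.size);    // prints: 10
```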

New Object functions

Most of the new Object functions added in ES5 (freeze, seal, getOwnPropertyDescriptor, defineProperty, keys...) have been added as "static methods" (Object.method) rather than "instance methods" (Object.prototype.method). This has seemed confusing to me since they first reared their (beautiful) head (all these are operations applied to an instance, and I would expect them to belong to the instance). When I finally made up my mind to post a question on StackOverflow seeking others' insight, I realized others had already asked and answered it. These 2 points seem key to the explanation:

Cleanly separate the meta and application layers

Try to minimize the API surface area (i.e., the number of methods and the complexity of their arguments)

If we think in terms of .Net or Java and Reflection, "meta operations" are neither instance nor static methods of Object, but instance methods of Type/Class, so the JavaScript approach is a bit different, but makes sense to me.
Anyway, even with this explanation in mind, some cases still look confusing to me:

  • Dealing with the internal prototype ([[Prototype]]) does not seem to be considered a meta operation, as we have Object.prototype.isPrototypeOf, but then, why do we have Object.getPrototypeOf? Having something called Object.prototype.getPrototype would look more consistent to me.
  • I can't see why we have Object.prototype.hasOwnProperty when other property-related methods like getOwnPropertyDescriptor and defineProperty are defined as "static" ones.
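
One practical argument for the "static" style, sketched below with made-up objects: an instance method can be missing or shadowed on the very object you are inspecting, while a function called from outside the instance cannot. hasOwn is a hypothetical helper name.

```javascript
// An object with no prototype has no hasOwnProperty at all...
var dictionary = Object.create(null);
dictionary.key = "value";
console.log(typeof dictionary.hasOwnProperty); // prints: undefined

// ...and an ordinary object can shadow it with something misleading.
var shadowed = {
    key: "value",
    hasOwnProperty: function () { return false; }
};
console.log(shadowed.hasOwnProperty("key")); // prints: false

// Calling the function from outside the instance works in both cases.
function hasOwn(obj, key) {
    return Object.prototype.hasOwnProperty.call(obj, key);
}
console.log(hasOwn(dictionary, "key")); // prints: true
console.log(hasOwn(shadowed, "key"));   // prints: true

// The ES5 "static" functions never had this problem.
console.log(Object.keys(dictionary).length); // prints: 1
```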

All this instance vs static thing has made me look back for a while. Years ago, in a period when I was quite into Python, I was quite puzzled by the presence of a string.upper function. Doing string.upper("my string") rather than "my string".upper() was new and very "old school" to me. This seemed at odds with the well understood criterion for choosing between a "static method" and an "instance method", since it's clearly operating on a concrete instance.
At the time I found a good explanation, and trying to find it again today, I've found the same idea, but expressed from the .Net world.

Developers new to .NET tend to be surprised that calling ToUpper on a string doesn't convert it to upper case. And I think it's perfectly reasonably to be surprised. ToUpper looks like an imperative operation. Its form actively conceals the fact that it's really a function in the strict functional programming sense - it's has no side effects, it merely takes an input expression returns a result based on that input expression without modifying the input in any way. The input expression is the implicit this reference - the object on which ToUpper is invoked.

Thinking about it again now, the rationale behind having Math.sin, Math.cos... rather than Number.prototype.sin, Number.prototype.cos could be in part that (apart from the Single Responsibility Principle and avoiding boxing)