Monday, September 28, 2009

Index your hard drive and find duplicates with M-Trees

Truth be told, I'm not a computer scientist... I'm a developer. It's not that I'm too dumb for research; I just prefer to build.
As a developer, I was curious about AI, so during a trip to China I bought Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig (it was not expensive!!).
Unluckily, this book is written for mathematicians, and I'm really, really too dumb to follow what's going on... (I understood some parts, but my mind had to turn off its creative mode.)

AI was too hard, so I decided to learn data mining instead, and I asked around on Stack Overflow a little bit.

Then I ordered Data Mining: Practical Machine Learning Tools and Techniques by Ian H. Witten and Eibe Frank on Amazon, and I don't regret it!!!!!
I discovered that data mining is just another name for AI, and some sections of this book overlap with my previous one. The difference is that this book really is for developers. Don't wait, just buy it: if you want ideas, you'll get them!
Metric trees (M-trees) are not covered in this book, but the authors mention them, so out of curiosity I googled.

In short, you just have to define a distance (a metric, as defined on Wikipedia) between two objects (not necessarily between two numbers).
Then you can easily search for the nearest neighbour of an object, or for all the objects within a given range. And it is very, very fast: approximately O(log(n)), where n is the number of objects.
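To make the idea concrete, here is a minimal sketch in Python. It is not an M-tree: it is a vantage-point tree, a simpler kind of metric tree that relies on the same trick (the triangle inequality lets a search skip whole branches). The names `Node`, `build`, `range_query` and `nearest` are made up for this example, and the logarithmic search time is only what you can hope for on well-spread data, not a guarantee.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple

Distance = Callable[[Any, Any], float]


@dataclass
class Node:
    point: Any                        # the vantage point stored at this node
    radius: float = 0.0               # split distance around the vantage point
    inside: Optional["Node"] = None   # points with distance <= radius
    outside: Optional["Node"] = None  # points with distance > radius


def build(points, distance: Distance) -> Optional[Node]:
    """Recursively split the points around a vantage point at the median distance."""
    points = list(points)
    if not points:
        return None
    vantage = points.pop()
    if not points:
        return Node(vantage)
    dists = sorted(((distance(vantage, p), p) for p in points), key=lambda t: t[0])
    radius = dists[len(dists) // 2][0]
    inside = [p for d, p in dists if d <= radius]
    outside = [p for d, p in dists if d > radius]
    return Node(vantage, radius, build(inside, distance), build(outside, distance))


def range_query(node, target, max_distance, distance: Distance, found=None) -> List[Any]:
    """Return every stored point within max_distance of target."""
    if found is None:
        found = []
    if node is None:
        return found
    d = distance(target, node.point)
    if d <= max_distance:
        found.append(node.point)
    # Triangle inequality: a branch is visited only if it can still contain a hit.
    if d - max_distance <= node.radius:
        range_query(node.inside, target, max_distance, distance, found)
    if d + max_distance > node.radius:
        range_query(node.outside, target, max_distance, distance, found)
    return found


def nearest(node, target, distance: Distance, best=None) -> Optional[Tuple[float, Any]]:
    """Return (distance, point) for the stored point closest to target."""
    if node is None:
        return best
    d = distance(target, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    # Search the side the target falls in first, then the other side only if needed.
    if d <= node.radius:
        first, second = node.inside, node.outside
    else:
        first, second = node.outside, node.inside
    best = nearest(first, target, distance, best)
    if abs(d - node.radius) <= best[0]:
        best = nearest(second, target, distance, best)
    return best
```

With plain numbers and `abs(a - b)` as the distance, `build([3, 8, 15, 21, 22, 40, 41], dist)` gives a tree on which `range_query(tree, 20, 2, dist)` finds 21 and 22. Nothing in the tree knows about numbers, though: any function that behaves like a real distance will do. Note that the build recursion gets deep when many points are identical, so a real indexer would need an iterative build.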

What if these objects were the files on my hard drives, and the distance function was the Hamming distance between two files? Yeah, I could easily find similar or duplicate files on my hard drive! See you, I need to code NOW!
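A rough sketch of what that distance function could look like, in Python. The Hamming distance is normally defined for sequences of equal length, so this example makes two assumptions of its own: it compares bytes rather than bits, and every extra byte of the longer file counts as one difference. The function names are hypothetical.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the byte positions at which a and b differ.

    Assumption: when the lengths differ, each extra byte of the
    longer input counts as one difference.
    """
    differing = sum(x != y for x, y in zip(a, b))
    return differing + abs(len(a) - len(b))


def file_hamming_distance(path_a, path_b, chunk_size: int = 1 << 16) -> int:
    """Hamming distance between two files, read chunk by chunk."""
    distance = 0
    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        while True:
            chunk_a = file_a.read(chunk_size)
            chunk_b = file_b.read(chunk_size)
            if not chunk_a and not chunk_b:
                return distance
            # Once one file is exhausted, its chunk is empty and every
            # remaining byte of the other file counts as a difference.
            distance += hamming_distance(chunk_a, chunk_b)
```

A distance of 0 means the two files are byte-for-byte identical, and a small distance means they are nearly the same. One limitation of this choice: a single byte inserted at the start of a file shifts everything after it, so two files a human would call similar can still end up far apart.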

Wednesday, September 23, 2009

Measurement and Leverage

Some time ago, I read "How to Make Wealth" by Paul Graham. And I finally understood why I don't enjoy coding for my company as much as coding for fun.
Thanks, Paul Graham. I'm sorry that you hate Microsoft so much, but thanks to you I figured out what makes coding fun for a developer.

The answer is "Measurement and Leverage".

When you code for yourself, you have both. I mean that you can measure whether what you've done is useful, and whether you like it, because... you work for yourself.
Most of the time, when you code something for yourself, either it's a great utility that saves you lots of time and so has an impact on your life, or you have learned something valuable in the process that will change your habits and your way of thinking. That's leverage: what you do for yourself has an impact on you.

Coding for fun has both measurement and leverage. Contributing to a community and sharing what you do increases measurement: a lot of people can tell you how much you suck.

Now, why don't I enjoy what I do at work? Even if it's technologically interesting? Even when I learn something from it? Even if I'm in a cool company?
I thought that the only things I would love doing at work were things for myself: if someone tells me to do X, I create a tool Y to do X for me, and this way it's fun because I'm my own user... But recently, I had a project which wasn't for me, and I enjoyed it. So maybe I'm not that selfish. Why did I enjoy it?

Two things. First, I was sure the user would like what I was doing and that my dear code would be directly useful to him. That's leverage: my program would change the way the user does things, and that's so cool!
The other thing is that my user frequently tried new versions of my program and gave me his feedback. That was measurement.

So from now on, I want nothing between the end user and me.
If the user ignores me, I'll ignore the program and look for users who care, or maybe I'll create a new program that users care about.

The bad thing is that, from what I've seen, when you are an employee, most of the time you lose leverage and measurement. The "team" does something and the "manager" tells you whether the overall program is good. In this case, you've lost everything.

So now I'm very excited: I will create a startup with Vincent. This adventure will be a great playground to test this theory: "Making stuff in collaboration with the end user is fun!"
I don't know where it will take us; the future will tell.

Monday, September 21, 2009

WCF over Twitter



Yeah, I know it's a crazy idea... but I know geeks will like it! You will be able to send SOAP messages over Twitter via WCF!!! You'll learn how to create your own TransportBindingElement.
So just go take a look at the project page on CodePlex.
And as always, if you are interested in the inner workings, it's on CodeProject.

Duplex MSMQ



How do you create duplex communication over MSMQ in WCF?
That's easy, just take a look at the project home on CodePlex.
If you are interested in the plumbing, go read my article on CodeProject! :)

Crazy coding ideas will come here !

I'm Nicolas Dorier, a French student who just loves to create useful (or not so useful) things in code.
I already have a blog in French, called Design Mantra. Although I'm French, I spend most of my time reading and writing articles on English sites. So I thought it would be a cool idea to write my blog in English.

After French and C#, my third language is English, and I hope I won't suck... Anyway, the goal will be to suck less every day!

My goal is to create interesting pieces of code with the concepts I will learn over time.
I'm a .NET developer, so my projects will be mainly in C# with the latest cool technologies...
I want to help my readers become more creative and give them ideas!

Maybe it will take a little time, but I definitely can't wait!