Thursday, December 7, 2006

Java Garbage Collection: How it works, How to control it


Java has Garbage Collection. Good thing, yes! But repeat after me:


The Garbage Collector deallocates objects with zero references whenever it wants!

Two troubles:

  • Objects with zero references: typical Java code looks like this:
    public void doSomething() {
        Configuration c = new Configuration("test.conf");
        long limit = c.getLimit();

        for (long i = 0; i <= limit; i++) {
            String s = String.valueOf(i);
            System.out.println(s);
        }
    }

    The Configuration instance stays referenced until the method returns, even though nothing uses it after getLimit(). We can write:
    public void doSomething() {
        Configuration c = new Configuration("test.conf");
        long limit = c.getLimit();
        c = null; // good! now the Configuration has 0 references!

        for (long i = 0; i <= limit; i++) {
            String s = String.valueOf(i);
            System.out.println(s);
        }
    }

  • GC deallocates them when it wants: it seems that the String instances will be cleaned up as soon as they go out of scope, but that isn't guaranteed!
    Such instances can end up in a particular state called the Invisible State: the reference is out of scope, but it may still occupy a slot in the method's stack frame.
    In that case the GC may not reclaim the objects until the method doSomething() returns!
    The workaround is simple: manually set the reference to null:
    public void doSomething() {
        Configuration c = new Configuration("test.conf");
        long limit = c.getLimit();
        c = null; // good! now the Configuration has 0 references!

        for (long i = 0; i <= limit; i++) {
            String s = String.valueOf(i);
            System.out.println(s);
            s = null; // good! now the String has 0 references!
        }
    }

    Now all our objects are eligible for collection, but only eligible.
    You cannot force garbage collection, just suggest it! You can suggest a GC run with System.gc(), but there is no guarantee about when (or whether) it will happen.
Interesting link: Sun's Truth About Garbage Collection.

Monday, November 27, 2006

How to Optimize your RSS Feed and Promote your Blog or Site


These are some optimizing rules:

  1. The title should contain important search terms: the title should entice the reader to read on, not mislead them

  2. Add a channel description field. It provides an opportunity to expand on the broad theme of the RSS feed

  3. The item titles should be 50-75 characters with spaces

  4. Link text should emphasize keywords: be sure to use keywords in any link text that points back to your website

  5. Don't use Buttons like "Add to Google", "My Yahoo!" and don't use javascript or similar. Make subscribing standard, simpler and compatible with one button only

  6. Include the feed on a personal my.yahoo or my.msn home page: it's the fastest way to have an RSS feed spidered by Yahoo or MSN

  7. Give your subscribers easy ways to email, tag, share, and act on the content you publish

  8. Share collected links (or photos) in your feed, too

  9. Geotag your feed: let everyone know your publication location by adding your latitude and longitude to your feed

  10. Add some color to your feed. Place a special image (like a corporate logo) in your feed so that it stands out from the pack when displayed in many popular RSS news readers

These are some promoting rules:
  1. Increase link popularity by submitting the RSS feed: this is the list of RSS Feeds

  2. Republish your feed anywhere. For example, you can include it as HTML or as a banner in your forum signature

  3. Show off your feed circulation using a counter. You can show your RSS subscription counter and/or your page access counter

  4. Offer feed updates via email to your biggest fans

  5. Include an auto-discovery tag in the HTML header of each web page

  6. If you want to promote your blog using an article, you have to use permalinks. You can also use them to cross-reference your posts

  7. Use linkbacks to notify when somebody links to one of your documents. You can use pingbacks, refbacks and trackbacks.

  8. Build good Buzz Marketing: you can register with directory services or build your social network. You can read Buzz Marketing with Blogs by Susannah Gardner or this Business Week article

  9. Use RSS announcers like RSS Announcer or free services like FeedBurner PingShot

One last piece of advice:
subscribe to a web analytics service like ShinyStat™. It's free and you can check which sites link to you and which search terms are successful in search engines.
Most of this advice can be put into practice using FeedBurner's services.

Saturday, November 25, 2006

Single-page PDF Linux Commands


I've created a single page PDF containing a Linux Command-Line List.
I think it's very useful to have all commands in a single printable page.

I hope you'll like it. This is the PDF:
LinuxCommandsList.pdf

If you want to modify it, you can edit the source file written in LaTeX.

To build the PDF you have to execute:

latex LinuxCommandsList.tex
dvips -t landscape LinuxCommandsList.dvi
ps2pdf -sPAPERSIZE=a4 LinuxCommandsList.ps

Monday, November 20, 2006

Error testing in your software


When you write methods, you should:

  • throw an exception from a public method when a precondition or postcondition of the exposed contract fails

  • use assertions to verify the preconditions and postconditions of nonpublic methods. For example:
    /* Set coordinates of image i in layout */
    void setImageLayout(const Image& i, const int point[]) {
        /* Preconditions: image already loaded and displayed
         * and coordinates are within bounds */
        assert(i.isLoaded() && i.inLayout());
        assert(point[0] >= 0 && point[0] <= Window::WindowLimit);
        assert(point[1] >= 0 && point[1] <= Window::WindowLimit);

        /* Set layout */
        layout_->setImage(i, point);

        /* Postconditions: image layout updated */
        assert(i.isUpdated());
    }
    Assertions are usually implemented so that they can be enabled or disabled. If assertions are disabled, assertion failures are ignored. When the program is released, they are often disabled (while exceptions of public methods continue their checking work...)

  • use unit tests to validate that a particular module of source code keeps working properly from each modification to the next.
    Extreme Programming states "Code the Unit Test First": in the development cycle you'll write tests, code, tests, code, tests, code, ...
    You can use many tools like JUnit, TestNG, CUnit or CPPUnit.
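The first two points can be combined in one small class. This is only a sketch: the Layout class, its moveTo() and store() members and the kLimit constant are names I made up for illustration. The public method throws because any caller may hand it bad data; the private helper asserts because only our own code calls it.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical class, used only to illustrate the two kinds of checks.
class Layout {
public:
    static constexpr int kLimit = 1000;  // assumed upper bound for coordinates

    // Public contract: any caller may pass bad data, so a violation throws.
    // This check stays active in release builds.
    void moveTo(int x, int y) {
        if (x < 0 || x > kLimit || y < 0 || y > kLimit) {
            throw std::out_of_range("coordinates outside the layout");
        }
        store(x, y);
    }

    int x() const { return x_; }
    int y() const { return y_; }

private:
    // Nonpublic helper: only this class calls it, so a bad argument is a
    // programming error. assert() is compiled out when NDEBUG is defined.
    void store(int x, int y) {
        assert(x >= 0 && x <= kLimit);   // precondition
        assert(y >= 0 && y <= kLimit);   // precondition
        x_ = x;
        y_ = y;
        assert(x_ == x && y_ == y);      // postcondition
    }

    int x_ = 0;
    int y_ = 0;
};
```

Calling moveTo(-1, 5) throws std::out_of_range and leaves the object unchanged, whether or not assertions are enabled.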

Remember this important rule: a test is not a unit test if:
  • It talks to the database (take a look at DbUnit)
  • It communicates across the network
  • It touches the file system
  • It can't run at the same time as any of your other unit tests
  • You have to do special things to your environment (such as editing config files) to run it
  • You are not testing a class in isolation from other concrete classes: constructs such as mock objects can help keep unit tests isolated (with tools like EasyMock and mockpp)
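To show what "in isolation from other concrete classes" can look like without any mocking library, here is a hand-written fake. Everything in it (MailSender, Notifier, FakeMailSender and the test function) is hypothetical and only meant as a sketch; tools like mockpp automate this kind of stand-in.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical interface: the only thing the class under test knows about.
class MailSender {
public:
    virtual ~MailSender() {}
    virtual void send(const std::string& to, const std::string& body) = 0;
};

// Class under test: depends on the interface, not on a network-backed class.
class Notifier {
public:
    explicit Notifier(MailSender& sender) : sender_(sender) {}

    void notifyAll(const std::vector<std::string>& users) {
        for (const std::string& user : users) {
            sender_.send(user, "build finished");
        }
    }

private:
    MailSender& sender_;
};

// Hand-written fake: records the calls in memory instead of using the network.
class FakeMailSender : public MailSender {
public:
    void send(const std::string& to, const std::string& body) override {
        recipients.push_back(to);
        lastBody = body;
    }

    std::vector<std::string> recipients;
    std::string lastBody;
};

// A unit test in the strict sense: no database, no network, no file system.
void testNotifierSendsToEveryUser() {
    FakeMailSender fake;
    Notifier notifier(fake);

    notifier.notifyAll({"ann", "bob"});

    assert(fake.recipients.size() == 2);
    assert(fake.recipients[0] == "ann");
    assert(fake.recipients[1] == "bob");
    assert(fake.lastBody == "build finished");
}
```

Because the fake lives entirely in memory, this test can run in parallel with any other test and needs no environment setup.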

Tuesday, November 14, 2006

Memory Leaks in C++: important rules

This topic seems obvious, but it isn't! Many memory leaks come from breaking these rules.

So, here are some rules you should keep in mind:

  1. In a polymorphic base class you must declare the destructor virtual. If you don't, deleting a derived object through a base-class pointer gives undefined results. Typically, the derived part of the object is never destroyed (memory leak). The same analysis applies to any class lacking a virtual destructor, including std::string and all the STL container types. Never inherit from a standard container or from any other class with a non-virtual destructor!
  2. Destructors should never emit exceptions. If a destructor throws while another exception is already propagating, two exceptions are active at the same time: the program is terminated or the behavior is undefined. There are three primary ways to avoid the trouble:
    • Terminate the program by calling abort
    • Swallow the exception (but this hides the fact that something failed)
    • Provide a regular function that gives clients the opportunity to react to problems that may arise (e.g., a regular member function does the risky work, and a boolean member records whether it has already run)
  3. Have assignment operators return a reference to *this
  4. To make sure that a resource is always released, we need to put that resource inside an object (a resource manager) whose destructor will automatically release the resource. This idea is often called Resource Acquisition Is Initialization (RAII).
  5. A smart pointer (SP) is a wrapper for a resource (e.g., std::auto_ptr). Its destructor automatically calls delete on what it points to. Attention: copying an auto_ptr sets the source to null, and the copy assumes sole ownership of the resource!
  6. A reference-counting smart pointer (RCSP), e.g. std::tr1::shared_ptr, is a smart pointer that keeps track of how many objects point to a particular resource and automatically deletes the resource when nobody is pointing to it any longer. It also supports custom deleters: this prevents the cross-DLL problem and can be used to automatically unlock mutexes.
  7. SP and RCSP use delete, not delete[]. For arrays, look to Boost classes: boost::scoped_array and boost::shared_array.
  8. Initialize SPs in standalone statements (not in complex operations). Failure to do this can lead to subtle resource leaks when exceptions are thrown
  9. By default, C++ passes objects by value:
    void viewImage(Image i) {
        i.view();
    }
    This is an expensive operation: function parameters are copies of the actual arguments! To bypass all the related constructions and destructions, you have to pass by reference-to-const:
    void viewImage(const Image& i) {
        i.view();
    }
    Passing parameters by reference also avoids the slicing problem: when a derived object is passed by value as a base-class parameter, only its base part is copied, so virtual functions called on the parameter resolve to the base-class versions.
  10. However, it's usually more efficient to pass the following items by value rather than by reference:
    • built-in types (e.g. an int)
    • iterators in STL
    • function objects in STL
  11. Prefer this way of writing functions that return a new object:
    inline const Image createImage(const Bitmap& bmp) {
        return Image(bmp.getName(), bmp.getData());
    }
    It's faster than you might expect and prevents many errors.

  12. Remember that you incur:
    • the cost of construction when control reaches the variable's definition
    • the cost of destruction when the variable goes out of scope
    You should postpone the definition until you have initialization arguments for it.
  13. In loops:
    • it can be more efficient to define variables outside the loop and assign to them on each loop iteration
    • it's more readable to define variables inside the loop. Unless you're dealing with a performance-sensitive part of your code, you should use this approach.
  14. Don't inline everything! Consider the 80-20 rule: a typical program spends 80% of its time executing only 20% of its code. Your goal as a software developer is to identify that 20% of your code and optimize it.
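Rules 1, 4, 6 and 8 can be seen working together in a few lines. This is a sketch under some assumptions: Shape, Triangle, liveTriangles, makeShape() and sharedOwners() are names invented for the example, and I use std::unique_ptr and std::shared_ptr, the C++11 standard successors of std::auto_ptr and std::tr1::shared_ptr, so a C++11 compiler is needed.

```cpp
#include <cassert>
#include <memory>

int liveTriangles = 0;  // counts Triangle objects that are still alive

class Shape {
public:
    virtual ~Shape() {}              // rule 1: virtual destructor in the base
    virtual int sides() const = 0;
};

class Triangle : public Shape {
public:
    Triangle() { ++liveTriangles; }
    ~Triangle() override { --liveTriangles; }
    int sides() const override { return 3; }
};

// Rules 4 and 8: the new'ed object goes straight into a resource manager,
// in a statement of its own, so no exception can sneak in between.
std::unique_ptr<Shape> makeShape() {
    std::unique_ptr<Shape> p(new Triangle);
    return p;  // ownership moves to the caller
}

// Rule 6: a reference-counting smart pointer tracks its owners.
long sharedOwners() {
    std::shared_ptr<Triangle> a(new Triangle);
    std::shared_ptr<Triangle> b = a;  // two owners, one Triangle
    return a.use_count();
}   // both owners go away here, so the Triangle is deleted exactly once

// Deleting through the base pointer runs ~Triangle() because ~Shape() is virtual.
bool destroyedThroughBasePointer() {
    int before = liveTriangles;
    {
        std::unique_ptr<Shape> s = makeShape();
        assert(liveTriangles == before + 1);
    }  // s goes out of scope: delete is called on a Shape*
    return liveTriangles == before;
}
```

Without the virtual destructor in Shape, the delete performed by the std::unique_ptr<Shape> would be undefined behavior, and in practice the counter would typically never go back down.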
This work comes from Scott Meyers's book "Effective C++, Third Edition".
George Belotsky's article is also very interesting.
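As a small illustration of rule 9 (copies and slicing), here is a sketch that counts copy constructions. This Image class is a stripped-down stand-in written for the example, and Thumbnail, kindByValue() and kindByRef() are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Stand-in for the Image class of rule 9; it counts copy constructions.
class Image {
public:
    Image() {}
    Image(const Image&) { ++copies; }
    virtual ~Image() {}
    virtual std::string kind() const { return "Image"; }

    static int copies;
};
int Image::copies = 0;

class Thumbnail : public Image {
public:
    std::string kind() const override { return "Thumbnail"; }
};

// Pass by value: the argument is copied, and a Thumbnail is sliced
// down to its Image part, so the base-class kind() is called.
std::string kindByValue(Image i) { return i.kind(); }

// Pass by reference-to-const: no copy, no slicing, the virtual call
// reaches the derived class.
std::string kindByRef(const Image& i) { return i.kind(); }
```

With a Thumbnail t, kindByValue(t) performs one copy and reports "Image", while kindByRef(t) performs no copy and reports "Thumbnail".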