Issue with decoding characters in API response

Problem

I'm having trouble handling values that come from an external API service in Node.

http.get(
    endpoint,
    function(res)
    {
        if (res.statusCode != 200)
        {
            return next();
        }
        var pageData = '';
        res.setEncoding('utf8');
        res.on(
            'data',
            function(chunk)
            {
                pageData += chunk;
            }
        );
        res.on(
            'end',
            function()
            {
                waterfallCallback(null, pageData);
            }
        );
    }
);

I then deserialize the string with an xml2js parser and use the data accordingly. A few of my strings contain non-ASCII characters, e.g. Ciné, and they are not stored correctly when saved to the database.

I've tried using the iconv package to convert from UTF-8 to ISO-8859-1, but I'm not sure this is the correct way to handle the situation.

Help by an expert is appreciated.

Thanks,

Dave

Problem courtesy of: ddibiase

Solution

I managed to get this going. Lesson learned: always understand the data you're dealing with. Not just the format and structure; the character encoding is absolutely vital too.
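One way to find out what encoding a response claims to use is to look at the Content-Type header and, for XML, the encoding attribute of the XML declaration. The following is only a sketch: declaredCharset is a hypothetical helper name, and it assumes the service declares its charset at all, which the original API may not have done.

```javascript
// Hypothetical helper: report the charset a response declares, or null.
// contentType is the raw Content-Type header value; body is a Buffer.
function declaredCharset(contentType, body) {
    // 1. HTTP header, e.g. "text/xml; charset=ISO-8859-1"
    var fromHeader = /charset\s*=\s*"?([^";\s]+)/i.exec(contentType || '');
    if (fromHeader) {
        return fromHeader[1].toLowerCase();
    }

    // 2. XML declaration, e.g. <?xml version="1.0" encoding="ISO-8859-1"?>
    // 'latin1' maps every byte to one character, so the ASCII prolog
    // can be read safely before the real encoding is known.
    var head = body.subarray(0, 200).toString('latin1');
    var fromXml = /<\?xml[^>]*encoding\s*=\s*["']([^"']+)["']/i.exec(head);
    return fromXml ? fromXml[1].toLowerCase() : null;
}
```

If both lookups return null, the encoding still has to be confirmed some other way, for example from the provider's documentation or by inspecting the raw bytes.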

The main issue was that the characters were being sent as ISO-8859-1, but I had no idea. Once I found that out, I converted everything to UTF-8, which is what the database and all my own API endpoints use.
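To see why this mismatch damages the text, here is a small sketch using only Node's built-in Buffer. It assumes the service sends é as the single ISO-8859-1 byte 0xE9, which matches the description above but was not captured from the real API.

```javascript
// "Ciné" as an ISO-8859-1 server would send it: é is the single byte 0xE9.
var latin1Bytes = Buffer.from('Cin\u00e9', 'latin1');

// Decoding those bytes as UTF-8, which is what res.setEncoding('utf8')
// does, fails: a lone 0xE9 is not a valid UTF-8 sequence, so it is
// replaced with U+FFFD (the replacement character).
var wrong = latin1Bytes.toString('utf8');

// Decoding with the encoding the server actually used keeps the text.
var right = latin1Bytes.toString('latin1');

console.log(JSON.stringify(wrong));
console.log(JSON.stringify(right));
```

Once the bytes have been decoded as UTF-8, the original é cannot be recovered from the string, which is why the response has to be kept as raw bytes until the correct encoding is applied.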

I changed my call to use the request node package, pulled the response down as binary, and used iconv to convert it to UTF-8. Here's some helpful code:

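var request = require('request');
var Iconv = require('iconv').Iconv;

request(
    {
        uri: 'http://' + endpoint.host + endpoint.path,
        // null makes request return the body as a raw Buffer
        // instead of decoding it as a UTF-8 string
        encoding: null
    },
    function(err, response, body)
    {
        if (! err && response.statusCode == 200)
        {
            // Convert the ISO-8859-1 bytes to UTF-8 bytes
            var converter = new Iconv('ISO-8859-1', 'UTF-8');
            var converted = converter.convert(body);

            callback(
                null,
                converted.toString('utf8')
            );
        }
        else
        {
            next();
        }
    }
);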

The returned result is now exactly what I'm expecting. =)
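As an alternative, the same conversion can probably be done without iconv, because Node's Buffer supports a 'latin1' encoding that corresponds to ISO-8859-1. This is a sketch, not the code I used: decodeLatin1 and fetchLatin1 are hypothetical names, and it assumes the response really is ISO-8859-1 rather than a similar encoding such as Windows-1252.

```javascript
var http = require('http');

// Join the raw chunks and decode them as ISO-8859-1 ('latin1' in Node).
function decodeLatin1(chunks) {
    return Buffer.concat(chunks).toString('latin1');
}

// Hypothetical wrapper; url and done are placeholder names.
function fetchLatin1(url, done) {
    http.get(url, function (res) {
        if (res.statusCode !== 200) {
            res.resume(); // discard the body so the socket is freed
            return done(new Error('Unexpected status ' + res.statusCode));
        }

        var chunks = [];
        // No res.setEncoding() call here, so each chunk stays a Buffer.
        res.on('data', function (chunk) {
            chunks.push(chunk);
        });
        res.on('end', function () {
            done(null, decodeLatin1(chunks));
        });
    }).on('error', done);
}
```

Collecting Buffers and decoding once at the end also avoids decoding chunk by chunk with the wrong encoding, which is what the original http.get code did.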

Solution courtesy of: ddibiase

This post first appeared on Node.js Recipes.
