Comments on Nihil Obstat: NY Times: Opinion Mining in Social Networks, Twitter
Blog author: Jose Maria Gomez Hidalgo (http://www.blogger.com/profile/17053588779560658723)
Updated: 2024-01-22T09:48:10.802+01:00

Published: 2009-09-01T07:25:26.563+02:00

Certainly, the easiest way to collect "good" opinions (with context, properly categorized as good, bad, or neutral, and maybe scaled) is not Twitter but Internet forums, review sites like ciao.es, and of course blogs like xataka.com. Facebook and other real social networks are challenging because the API usually prevents mass data collection. This is why I suggest going through brand/product "users"/"groups"; that is what I am planning after a trial on Twitter as a proof of concept, because SNs are the focus of my work for a Spanish OM project, which is not for Flax.

Some time ago we talked about using Freeling, SentimentWordnet, and so on: they are still on my "pending" list, as first I will be testing the classical bag of words and putting it into an architecture for my application; then I can go for harder things. Besides, my deadline for the project is the end of the year, so I have to hurry up! :-)

Author: Jose Maria Gomez Hidalgo

Published: 2009-08-31T20:51:57.083+02:00

Hi Jose, thanks a lot for your responses regarding my two comments on the new opinion mining software.

1. I already did what you told me about using n-grams to recognize the language and it works perfectly well, but I was surprised that it did not work properly in the online applications.

2. You are completely right about the reason why they mostly use Twitter. I also implemented opinion mining on blogs, and have started to analyze discussion lists as well. I do not find blogs very complex if you have some expertise with search engines, because for this we had to implement a crawler and a blog ranker (EigenRumors), which is not easy at all, and also extract the relevant textual content from the pages before we could analyze the text.

Great advice about searching for information on Facebook; it did not occur to me to do it that way.

I could infer from your previous post that you are working on the FlaxSentiment module of the Flax software. It seems extremely interesting, but there is no information online about it.
Do you mind if we chat a little bit about strategies for doing this? And you are not, by any chance, doing it in Spanish, are you? Because I am, though not only in Spanish.

Also, thank you very much for the compliment, Jose, you are very kind.

Author: Mariana Soffer (https://www.blogger.com/profile/13351209522681966230)

Published: 2009-08-31T06:55:48.754+02:00

Besides, I read your post and it looks nice :-)

Author: Jose Maria Gomez Hidalgo

Published: 2009-08-31T06:55:04.134+02:00

Hi Mariana (again)

Regarding concern #1, I see it, and what I suggest is performing language identification yourself instead of relying on Twitter's language feature. Twitter probably bases language identification on the language the user states in their preferences and the like. I propose building your own trigram model and ignoring Twitter's language features.

Regarding #2, that is a fact. How many apps have you seen that publicly do opinion mining on other networks? People work on Twitter because it is open; however, it is very hard because of the lack of context. In that sense, you are absolutely right. For clearer opinions, I will try going through Facebook to brand sites/users (e.g. Burger King) and collecting opinions about posts announcing new services, products and so on.

Finally, yes, I am working on Opinion Mining for a work-related project. I cannot release details about it, but you could guess the applications from a previous post on this blog.

Author: Jose Maria Gomez Hidalgo

Published: 2009-08-28T11:00:59.486+02:00

Nice post, several issues regarding it.
1. I know that with trigrams you can get 99 percent accuracy in language identification. Nevertheless, if you do a Twitter search through the Twitter API and specify the language you are searching in, around 15 percent of the tweets are in the wrong language. I tried it a lot in Spanish and got lots of tweets in Portuguese and Italian as well.
2. I am surprised that Twitter is the main place used for opinion mining, because, as you can see, the polarity assigned to the opinions is often wrong. I would say that the accuracy is very low: I wrote a program myself, a very simple one, and its accuracy was better than that of the 3 new websites that deal with this subject. The problem with analyzing tweets is that you have very little context, and the meaning of the text can change greatly depending on it, so it is almost impossible to evaluate something if you do not know what field you are dealing with. For example, there are expressions such as those in the famous paper which says that "the beer is hot" is a criticism and "the wine is old" is a compliment, whereas for most other products you qualify with those words the result must be the opposite, because being old is generally considered a bad thing.
If you want to check, I give a brief introduction to this in my last post. It is really basic, but it might be useful for those who do not know anything about the subject.
http://singyourownlullaby.blogspot.com/2009/08/opinion-mining-and-sentiment-analysis.html

Cheers nihil
PS: are you into Spanish opinion mining yourself?

Author: Mariana Soffer
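
The trigram language identification suggested in this thread (build your own model rather than trusting the language filter of a search API) can be sketched roughly as follows. This is only a minimal illustration and not the code either commenter used: the training sentences in `SAMPLES`, the cosine-similarity scoring, and the function names `trigrams`, `cosine`, and `identify` are all assumptions made for the example. A usable model would need far more text per language.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy training text; a real model needs much more data per language.
SAMPLES = {
    "es": "el vino es muy bueno y la cerveza está fría, me gusta mucho este lugar",
    "pt": "o vinho é muito bom e a cerveja está gelada, eu gosto muito deste lugar",
    "it": "il vino è molto buono e la birra è fredda, mi piace molto questo posto",
}


def trigrams(text):
    """Count overlapping character trigrams, padding with one space at each end."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse trigram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(c * c for c in a.values())) * sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


# One trigram profile per language, built once from the toy samples.
PROFILES = {lang: trigrams(text) for lang, text in SAMPLES.items()}


def identify(text):
    """Return the language code whose trigram profile is most similar to the text."""
    query = trigrams(text)
    scores = {lang: cosine(query, profile) for lang, profile in PROFILES.items()}
    return max(scores, key=scores.get)
```

Calling `identify(tweet_text)` on each tweet returned by a language-restricted search would give a second opinion on its language, which is one way to filter out the Portuguese and Italian tweets mentioned above.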
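
The point about context ("the beer is hot" versus "the wine is old") can be illustrated with a small lexicon-based, bag-of-words scorer in which a word's polarity depends on the product domain. This is a sketch under stated assumptions, not the method of the paper or of the commenters: the lexicon entries, the domain names, and the `polarity` function are hypothetical and chosen only to reproduce the examples from the comment.

```python
# Hypothetical lexicon: the polarity of a word depends on the product domain.
DOMAIN_POLARITY = {
    "beer": {"hot": -1, "cold": +1, "old": -1},
    "wine": {"old": +1, "young": -1},
}

# Fallback polarities used when the domain has no entry for a word.
DEFAULT_POLARITY = {"old": -1, "hot": 0, "good": +1, "bad": -1}


def polarity(text, domain):
    """Sum word polarities, preferring the domain-specific value for each word."""
    domain_lexicon = DOMAIN_POLARITY.get(domain, {})
    score = 0
    for word in text.lower().split():
        word = word.strip(".,!?")
        score += domain_lexicon.get(word, DEFAULT_POLARITY.get(word, 0))
    return score


print(polarity("The beer is hot", "beer"))    # -1: a criticism
print(polarity("The wine is old", "wine"))    # 1: a compliment
print(polarity("The phone is old", "phone"))  # -1: falls back to the default
```

Without the `domain` argument the scorer could only use the default values, which is the situation with short tweets when the field is unknown.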