Comments on Nihil Obstat: A Simple Text Classifier in Java with WEKA
Blog author: Jose Maria Gomez Hidalgo (http://www.blogger.com/profile/17053588779560658723)
Showing comments 1-25 of 65, newest first. Last updated 2024-01-22.

Jose Maria Gomez Hidalgo, 2021-08-20 09:11:
Please be more precise. What problem are you having?

Anonymous, 2021-08-10 10:39:
Hi Jose, please help me. I am new to WEKA and I am unable to work out this code.

SilverCord, 2018-07-26 17:09:
Hi Jose,

I am really thankful there is someone like you. I have one question about WEKA: do I need to train the classifier first every time I want to predict something? Doesn't the model I saved keep the classifier (i.e. something I can just load and then use to predict)?

Regards,

Anonymous, 2018-02-06 06:50:
Hi, I am trying to preprocess the data by applying a filter while loading it. However, it throws an UnsupportedAttributeType exception when I train the classifier.

The modified load function is as follows:

    public void loadDataset(String fileName) {
        try {
            BufferedReader reader = new BufferedReader(new FileReader(fileName));
            ArffLoader.ArffReader arff = new ArffLoader.ArffReader(reader);
            trainData = arff.getData();

            filter = new StringToWordVector();
            filter.setAttributeIndices("first-last");
            filter.setMinTermFreq(5);
            filter.setTokenizer(new WordTokenizer());
            filter.setStemmer(new IteratedLovinsStemmer());
            filter.setStopwordsHandler(new Rainbow());
            filter.setWordsToKeep(100000);
            filter.setOutputWordCounts(true);
            filter.setIDFTransform(true);
            filter.setTFTransform(true);
            // generate new data
            try {
                filter.setInputFormat(trainData);
                Instances newData = Filter.useFilter(trainData, filter);
                trainData = newData;
                System.out.println(newData);
            } catch (Exception e) {
                e.printStackTrace();
            }
            System.out.println("===== Loaded dataset: " + fileName + " =====");
            System.out.println(trainData);
            reader.close();
        }
        catch (IOException e) {
            System.out.println("Problem found when reading: " + e);
        }
    }

Can you help me understand what I have done wrong?

Jose Maria Gomez Hidalgo, 2017-10-29 10:22:
Yes I am. How can I help you?

chamelion, 2017-10-27 21:48:
Hi, are you still replying to comments on this post?
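
On SilverCord's question above about retraining before every prediction: a saved WEKA model is a serialized Java object, so a classifier that was trained once can normally be written to disk and later loaded and used without training again. WEKA provides weka.core.SerializationHelper for this. The sketch below is not the post's code; it uses only the Java standard library to show the same save/load round trip. ToyModel, ModelRoundTrip, roundTrip and the sample texts are hypothetical stand-ins for a trained classifier and its data.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ModelRoundTrip {
    /** Hypothetical stand-in for a trained classifier; it only needs to be Serializable. */
    static class ToyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        private final Set<String> spamWords = new HashSet<>();

        /** "Training": remember every word seen in the spam examples. */
        void train(List<String> spamTexts) {
            for (String text : spamTexts) {
                spamWords.addAll(Arrays.asList(text.toLowerCase().split("\\s+")));
            }
        }

        /** Predict "spam" if any remembered word occurs in the text, else "ham". */
        String classify(String text) {
            for (String word : text.toLowerCase().split("\\s+")) {
                if (spamWords.contains(word)) return "spam";
            }
            return "ham";
        }
    }

    /** Write the trained model to a file using Java object serialization. */
    static void save(Object model, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(model);
        }
    }

    /** Read a previously saved model back; no training happens here. */
    static Object load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return in.readObject();
        }
    }

    /** Save to a temporary file and load again, wrapping checked exceptions. */
    static ToyModel roundTrip(ToyModel model) {
        try {
            File file = File.createTempFile("toy-model", ".dat");
            file.deleteOnExit();
            save(model, file);
            return (ToyModel) load(file);
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not save or load the model", e);
        }
    }

    public static void main(String[] args) {
        ToyModel trained = new ToyModel();
        trained.train(Arrays.asList("win cash now"));   // train once
        ToyModel restored = roundTrip(trained);         // later: load only
        System.out.println(restored.classify("claim your cash prize")); // prints "spam"
        System.out.println(restored.classify("see you at lunch"));      // prints "ham"
    }
}
```

With a real WEKA model the loaded object would be cast to the classifier type that was saved (for example a FilteredClassifier), and it only works if the saved and loading sides use compatible classes.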

Anonymous, 2016-02-10 15:09:
Hello, my name is Abiud Leal and I am very interested in this post.

I want to do something similar. In my case I want to train the model with a corpus of comments that contains the following classes: complaint (queja), suggestion (sugerencia) and congratulation (felicitacion).

Once the model is trained, it should tell me the type of any comment I enter.

Example: "felicidades al cocinero todo estuvo rico" (congratulations to the cook, everything was delicious) should be classified as felicitación.

I have already tried to run your program, but I get several errors.

Which version of WEKA do you use? Could you please provide it to me?

I look forward to your help.

Thank you very much.

Jose Maria Gomez Hidalgo, 2015-11-02 09:18:
@Adina - This post explains exactly that. You can apply the same configuration of the StringToWordVector filter to the test set by using a FilteredClassifier.

@Kikazz - You are right, that code can be factored out into the main function or a separate "initialization" one. My purpose was to let you easily delete the function you don't need without losing the one you do, while keeping all the code for evaluation or training together. But the way you propose is better.

Kikazz, 2015-11-01 17:38:
Hello Jose,

This was a really great way for me to understand how to get started with WEKA, more than any other tutorial I have come across. A million thanks for this!
One question: your MyFilteredLearner class has an evaluate() and a learn() method, both of which perform mostly the same steps of initialising and setting options for many of the same variables. Can't this be handled in the main function itself? Or by declaring the classifiers globally, to avoid repeating the code in the learn() method?

Anonymous, 2015-04-16 22:38:
Hello. I am new to WEKA. I have read about classification and understood it, but there is one thing about testing I don't understand:
I have 4 news categories. I made an ARFF file, transformed it with StringToWordVector and classified it.
Now I want to test one new text (one news item).
How do I transform this plain text into a test set?

Jose Maria Gomez Hidalgo, 2015-01-09 09:48:
Hi, Anonymous

Well, if you are following the instructions exactly and using the right file format and WEKA version, I cannot guess what is wrong, as it works for me.

My suggestion: pack everything and send it to me by email or put it in Dropbox. I will examine it.

Regards,

JM

Jose Maria Gomez Hidalgo, 2015-01-09 09:45:
Hi, Tharaka

It is strange. In principle you should be able to use a model file previously saved from the Explorer with my code, provided the classifier is compatible (the same kind of FilteredClassifier with the same filters, classifier and so on). The name of the file does not matter...

I am afraid I cannot provide better guidance without more details...

Regards

JM

tharakakeerthi, 2014-12-11 11:23:
Hi, I'm new to WEKA and I'm implementing a movie classifier system based on genres for my project. I have a small question regarding your code. When you load the model, it seems that you load a "something.dat" file. But I'm loading a "something.model" file previously created and saved using the WEKA Explorer. Can you tell me whether this is the reason why I keep getting errors in the "classify" function? Thank you in advance.

Anonymous, 2014-11-26 16:08:
Hi, this looks like an excellent demonstration of how to use WEKA with Java. Unfortunately, I have run into an issue right at the end.

I copied and pasted your classes, used the example file formats for the training instances and the new instance, and I am using the WEKA developer version. The classifier is built, learned and evaluated correctly. But when I run the MyFilteredClassifier methods to load the instance, load the model, make the instance and classify it, it fails to classify the instance.
I get the following error: No output instance format defined

This is the single line of my instance file:
this is spam or not, who knows?

This is the start of my training ARFF file:

    @relation sms_test

    @attribute spamclass {spam,ham}
    @attribute text String

    @data
    ham,'Go............................

Could you please let me know why this is happening? I am using the exact code and file formats you have supplied. Thanks in advance.

Jose Maria Gomez Hidalgo, 2014-11-13 06:28:
Hello

Without knowing more details about your installation I cannot be 100% sure, but most likely we have different versions of WEKA. In this post I used the development version, which at the time of writing was 3.7.1, if I remember correctly.

Regards

Anonymous, 2014-11-13 03:53:
Hello, I have tried to run the code, but I get an error on this line:

    DenseInstance instance = new DenseInstance(2);

I don't know what causes the error.

Anonymous, 2014-10-13 05:07:
Hello Mr. Raul, I have this:

    double pred = classifier.classifyInstance(instances.instance(0));
    System.out.println("Predicted class: " + instances.classAttribute().value((int) pred));

How can I get the error percentage for the class it predicts?

The WEKA application does this, but how do I do it in Java? I have tried all the methods but none of them works for me. Please help... thanks.
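
On the question above about getting a percentage for the predicted class: one common approach, offered here as a sketch rather than as the author's answer, is to ask the classifier for its class probability estimates instead of a single label. WEKA classifiers expose distributionForInstance(Instance), which returns a double[] with one estimated probability per class value. The stdlib-only example below assumes such an array is already in hand; the labels, the probabilities and the names PredictionConfidence, argMax and describe are made up for illustration.

```java
import java.util.Locale;

class PredictionConfidence {
    /** Index of the largest probability, i.e. the predicted class. */
    static int argMax(double[] dist) {
        if (dist.length == 0) {
            throw new IllegalArgumentException("empty distribution");
        }
        int best = 0;
        for (int i = 1; i < dist.length; i++) {
            if (dist[i] > dist[best]) best = i;
        }
        return best;
    }

    /** Predicted label together with its estimated probability as a percentage. */
    static String describe(String[] labels, double[] dist) {
        int best = argMax(dist);
        return String.format(Locale.ROOT, "%s (%.1f%%)", labels[best], 100.0 * dist[best]);
    }

    public static void main(String[] args) {
        // Assumed stand-ins: in WEKA the labels would come from the class
        // attribute and the array from classifier.distributionForInstance(...).
        String[] labels = {"spam", "ham"};
        double[] dist = {0.2, 0.8};
        System.out.println(describe(labels, dist)); // prints "ham (80.0%)"
    }
}
```

One minus the reported probability can be read as the classifier's own estimate of being wrong on that instance. It is an estimate from the model, not a measured error rate.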

Jose Maria Gomez Hidalgo, 2014-07-29 13:18:
Hello, Raúl

Honestly, this is not a topic I am an expert in; as you know, Natural Language Processing is a very broad field...

My advice is, on the one hand, to search for APIs using the keyword "textmining" on Twitter, where there are several, and see whether any of them solves your problem.

On the other hand, you should search for "text segmentation" on Google; a first search has already turned up some results that would need further investigation.

Good luck!

Anonymous, 2014-07-29 09:04:
Hello Jose, on reading this article I was wondering...

Is there already an API or method in this area of computing that lets you take a text, whether from an article or a book, and classify its content into paragraph, title, subtitle and so on? Basically, a way to break it down by recognizing the logical structure of the text itself.

If so, could you mention one, or recommend where to look?

I ask because I am researching something along these lines at my university and I would like to know your opinion.

Regards

Jose Maria Gomez Hidalgo, 2014-07-21 13:44:
Hi

Can you post the line on which you get the error? I guess you get it when running MyFilteredClassifier.java, but it works for me with the sample data and WEKA 3.7.9...

Rgds

HNJM, 2014-07-18 18:53:
Hey Jose, thanks for this example.
I tried it but I have a problem. You suggested switching the methods learn() and evaluate(). I did this, and training and evaluation work. But when I want to classify my own text afterwards, I get the following error:

    java.lang.NullPointerException: No output instance format defined

I didn't see in your code that you set the output format. Do you know what I have to do?

Greets

Jose Maria Gomez Hidalgo, 2014-05-21 08:02:
I am afraid the output is not very informative, so I cannot help you with this unless I have more information. In particular, a short sample of the training and testing files may be enough. However, you need to describe the process for generating the model in more detail: did you just use the Explorer? Which version? Which model (classifier)? Etc.

Anonymous, 2014-04-30 17:13:
Can you please check my error :)?

Anonymous, 2014-04-25 00:04:
Thank you for your reply. How can I know if they are not compatible? I built them using the WEKA tool, not your MyFilteredLearner.java. Does this cause the problem?

Also, I have replaced line #121 and I got this error:

    java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
        at java.util.ArrayList.rangeCheck(ArrayList.java:604)
        at java.util.ArrayList.get(ArrayList.java:382)
        at weka.core.Instances.attribute(Instances.java:341)
        at weka.core.AttributeLocator.locate(AttributeLocator.java:153)
        at weka.core.AttributeLocator.initialize(AttributeLocator.java:119)
        at weka.core.AttributeLocator.<init>(AttributeLocator.java:102)
        at weka.core.StringLocator.<init>(StringLocator.java:69)
        at weka.filters.Filter.flushInput(Filter.java:431)
        at weka.filters.unsupervised.attribute.StringToWordVector.batchFinished(StringToWordVector.java:768)
        at weka.classifiers.meta.FilteredClassifier.filterInstance(FilteredClassifier.java:474)
        at weka.classifiers.meta.FilteredClassifier.distributionForInstance(FilteredClassifier.java:495)
        at weka.classifiers.AbstractClassifier.classifyInstance(AbstractClassifier.java:70)
        at myfilteredclassifier.MyFilteredClassifier.classify(MyFilteredClassifier.java:117)
        at myfilteredclassifier.MyFilteredClassifier.main(MyFilteredClassifier.java:197)