|
Speaking of unstable databases....
Last post 02-08-2008 2:37 AM by asuffield. 100 replies.
-
-
asuffield


- Joined on 05-31-2006
- Posts 2,137
|
Re: Speaking of unstable databases....
superjer:I'm not even sure what to say to you guys... but multithreading can vastly improve "perceived performance" in certain situations. Right now I'm working on an app that completes 3.99 times faster with 4 threads than with just one. We're talking 4 hours down to 1 hour.
This indicates a bug or design flaw in the application, if all those threads are running on the same processor. You could have done it even faster with a single thread. The fact that you didn't does not mean anything.
LoztInSpace:Multithreading works well when you have blocking IO combined with CPU work. Like, for example, a database, an operating system, a web server to name a few. Any application that exhibits the IO+CPU+IO will typically benefit from MT.
This one falls under the heading of "popular but wrong beliefs". Any application of this form will benefit vastly more from being written by somebody who understands how to do it right in a single thread. This is because the number of threads required to unjam the system when using braindamaged IO will always exceed the number of available processors by several orders of magnitude. Early versions of Java are largely responsible for propagating this belief, because their OS interface library was too broken to support doing it right. This was fixed years ago.
There is precisely one situation in which threading is fundamentally the right solution, and that is to utilise multiple processors in an SMP system. Under any other circumstances there is a faster and better way, even if you don't know what it is.
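For readers wondering what "doing it right in a single thread" might look like: the usual approach is readiness-based I/O multiplexing, where one thread asks the OS which descriptors are ready and only then reads from them. The post does not say which mechanism is meant, so the following is only a minimal Python sketch of that general idea. The function name `serve_all` and the peer names are made up for illustration, and local socket pairs stand in for real clients.

```python
import selectors
import socket

def serve_all(readers):
    """Read one message from each socket using a single thread.

    The thread blocks only inside select(), and only until *some*
    socket is ready, so one slow peer never stalls the others.
    """
    sel = selectors.DefaultSelector()
    for name, sock in readers.items():
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ, data=name)

    received = {}
    while len(received) < len(readers):
        for key, _events in sel.select():
            received[key.data] = key.fileobj.recv(4096)
            sel.unregister(key.fileobj)
    sel.close()
    return received

if __name__ == "__main__":
    # Hypothetical stand-ins for network peers: three local socket pairs.
    pairs = {name: socket.socketpair() for name in ("a", "b", "c")}
    for name, (writer, _reader) in pairs.items():
        writer.sendall(f"hello from {name}".encode())
    print(serve_all({name: reader for name, (_w, reader) in pairs.items()}))
    for writer, reader in pairs.values():
        writer.close()
        reader.close()
```

No extra threads are created here; whether this beats a thread-per-connection design in a given application is exactly what the thread is arguing about.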
|
|
-
-
asuffield


- Joined on 05-31-2006
- Posts 2,137
|
Re: Speaking of unstable databases....
superjer:Automatic parallelization is very limited in scope. In a ray-tracer, for example, which is highly parallelizable, the compiler is not going to make it parallel for you. You have to do it yourself with threads. It's the only way to do it.
Actually, researchers have been producing compilers that can do just this for the past couple of decades - it's quite possible (and ray tracers are a popular example for them to demonstrate it with). It's just that people don't use those compilers. There is no particularly good reason for this state of affairs, although everybody involved can come up with a plausible-sounding explanation of why it is not their fault.
|
|
-
-
dlikhten


- Joined on 09-27-2007
- New York Citeyah
- Posts 670
|
Re: Speaking of unstable databases....
Outlaw Programmer:Am I missing something here? What does this have to do with multithreading? If he needed locks to make it threadsafe then he could have used Java's built-in locking mechanism. Seemed to me like he felt the database itself was going to be used by more than 1 program (maybe more than 1 instance of this very program) at a given time, which is why he needed file locks to keep the separate processes from stepping on each other's toes.
That's what I thought. Why would you not use synchronized blocks in Java if you are using a single-threaded application? The only possible reason for using a lock file is because of multi-threading or multi-processing applications. Hence my misconception about why he shoulda gone for multithreading... Oh well... Guess we'll all implement single-threaded lock-file applications like MasterPlanSoftware.
Edit: Why does he even need locks? If he is completely single-threaded there is no reason for locks/synchronization. Locks are for multi-processes. Synchronization is for multi-threading... So go explain that one away, my friends.
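The distinction drawn above (file locks between processes, synchronization between threads) can be sketched roughly as follows. This is Python rather than the Java under discussion, so `threading.Lock` stands in for a `synchronized` block and a Unix-only advisory `fcntl.flock` stands in for the lock file; the names `Counter` and `append_locked` are hypothetical.

```python
import fcntl
import os
import tempfile
import threading

class Counter:
    """Shared in-memory state guarded by an in-process lock."""

    def __init__(self):
        self._lock = threading.Lock()  # invisible to other processes
        self.value = 0

    def add(self, n):
        for _ in range(n):
            with self._lock:           # roughly what `synchronized` does
                self.value += 1

def append_locked(path, line):
    """Append to a shared file while holding an advisory file lock.

    Other processes that also call flock() on this file wait their turn.
    """
    with open(path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        try:
            f.write(line + "\n")
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

if __name__ == "__main__":
    counter = Counter()
    workers = [threading.Thread(target=counter.add, args=(10_000,))
               for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print("counter:", counter.value)

    fd, path = tempfile.mkstemp()
    os.close(fd)
    append_locked(path, "record 1")
    append_locked(path, "record 2")
    with open(path) as f:
        print(f.read(), end="")
    os.remove(path)
```

A program that is single-threaded and the only process touching its data needs neither of these, which is the point of the "Edit" above.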
Code is like a box of chocolates. You never know who stuck a turd in there and why. The Stupidest Man On EarthSSDS Bug: Program should not start up
|
|
-
-
superjer


- Joined on 10-04-2007
- Seattle
- Posts 104
|
Re: Speaking of unstable databases....
Ok I wrote an app called multicore_test to prove my point. And guess what... it works. Surprise.
The app sums the first 40,000,000 terms of: 1/sqrt(1) - 1/sqrt(2) + 1/sqrt(3) - 1/sqrt(4) ... With threads it can run on multiple cores and complete in parallel, much faster. The numeric param is the number of threads. Look:
$ time ./multicore_test 1
Thread0: 1 to 40000000 Sum is 6.048196e-01
Total sum: 6.048196e-01
real 0m2.271s
user 0m2.265s
sys 0m0.001s
$ time ./multicore_test 4
Thread2: 20000001 to 30000000 Sum is 2.051631e-05
Thread0: 1 to 10000000 Sum is 6.047405e-01
Thread1: 10000001 to 20000000 Sum is 4.631048e-05
Thread3: 30000001 to 40000000 Sum is 1.223015e-05
Total sum: 6.048196e-01
real 0m1.474s
user 0m2.259s
sys 0m0.006s
As you can see it is faster (1.474s < 2.271s) with four threads than one. This happens consistently. If you watch the processes in top it even shows CPU use at 199% to indicate that it's using both of my cores. I'm on Fedora 8, btw.
asuffield: even if you could make the algorithm faster (admittedly the best strategy in general) you can still always double or quadruple the speed by utilizing your bored, idling cores. In some apps this is crucial, like games and bulk number-crunching. And boinc!!
Here is my code:
http://www.superjer.com/lies/multicore_test/multicore_test.c
http://www.superjer.com/lies/multicore_test/Makefile (uses SDL)
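The output above suggests the work is split into contiguous index ranges, one per thread, with the partial sums added at the end. The following is a rough Python sketch of that decomposition only; it is not the linked C/SDL program, and the function names are made up. One loud caveat: CPython's global interpreter lock serializes pure-Python arithmetic, so this version shows the structure but would not reproduce the speedup seen with native threads.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def split_ranges(n_terms, n_workers):
    """Split 1..n_terms into n_workers contiguous, inclusive ranges."""
    base, extra = divmod(n_terms, n_workers)
    ranges, start = [], 1
    for i in range(n_workers):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges

def partial_sum(lo, hi):
    """Sum (-1)^(k+1) / sqrt(k) for k in lo..hi inclusive."""
    total = 0.0
    for k in range(lo, hi + 1):
        term = 1.0 / math.sqrt(k)
        total += term if k % 2 else -term
    return total

def threaded_sum(n_terms, n_workers):
    """Compute each range in its own worker, then combine the parts."""
    ranges = split_ranges(n_terms, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda r: partial_sum(*r), ranges))
    return ranges, parts, sum(parts)

if __name__ == "__main__":
    # 400,000 terms rather than the post's 40,000,000, to keep this quick.
    ranges, parts, total = threaded_sum(400_000, 4)
    for i, ((lo, hi), s) in enumerate(zip(ranges, parts)):
        print(f"Thread{i}: {lo} to {hi} Sum is {s:e}")
    print(f"Total sum: {total:e}")
```

Because the ranges share no state, no locking is needed until the final addition, which is what makes this kind of job easy to spread across cores.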
Message encrypted to secure copyrighted data. Any attempt to break the encryption is in violation of the Digital Millenium Copyright Act of the USA (and similar laws worldwide), and is punishable by fines and/or imprisonment.
|
|
-
-
superjer


- Joined on 10-04-2007
- Seattle
- Posts 104
|
Re: Speaking of unstable databases....
Much more impressive on the Intel Core 2 Extreme (4x3.0GHz):
ubuntu@ubuntu:~$ time ./multicore_test 1
Thread0: 1 to 40000000 Sum is 6.048196e-01
Total sum: 6.048196e-01
real 0m1.403s
user 0m1.404s
sys 0m0.000s
ubuntu@ubuntu:~$ time ./multicore_test 4
Thread0: 1 to 10000000 Sum is 6.047405e-01
Thread2: 20000001 to 30000000 Sum is 2.051631e-05
Thread3: 30000001 to 40000000 Sum is 1.223015e-05
Thread1: 10000001 to 20000000 Sum is 4.631048e-05
Total sum: 6.048196e-01
real 0m0.406s
user 0m1.396s
sys 0m0.004s
1.403s down to just 0.406s! (Ubuntu 7.10)
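To put numbers on the two sets of timings posted so far, here is a small Python calculation of speedup (single-thread wall time divided by multi-thread wall time) and per-core efficiency (speedup divided by core count). The core counts are read off the posts ("both of my cores" and "4x3.0Ghz"); the function names are just for illustration.

```python
def speedup(t_one_thread, t_n_threads):
    """How many times faster the multi-threaded run finished."""
    return t_one_thread / t_n_threads

def efficiency(t_one_thread, t_n_threads, cores):
    """Fraction of the ideal (cores-times) speedup actually achieved."""
    return speedup(t_one_thread, t_n_threads) / cores

# (real time with 1 thread, real time with 4 threads, cores), from the posts.
runs = {
    "Fedora 8, 2 cores": (2.271, 1.474, 2),
    "Core 2 Extreme, 4 cores": (1.403, 0.406, 4),
}

if __name__ == "__main__":
    for label, (t1, tn, cores) in runs.items():
        print(f"{label}: speedup {speedup(t1, tn):.2f}x, "
              f"efficiency {efficiency(t1, tn, cores):.0%}")
```

Note also that the `user` times barely change between runs (2.265s vs 2.259s, 1.404s vs 1.396s): the total CPU work is the same, it is just spread over more cores.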
|
|
-
-
asuffield


- Joined on 05-31-2006
- Posts 2,137
|
Re: Speaking of unstable databases....
$ time ./multicore_test 1
real 0m1.344s
$ time ./multicore_test 4
real 0m1.351s
$ time ./multicore_test 80
real 0m1.490s
Exactly as predicted. Threading makes performance slightly worse except in one specific scenario, as I described.
|
|
-
-
superjer


- Joined on 10-04-2007
- Seattle
- Posts 104
|
Re: Speaking of unstable databases....
Well it appears you only have 1 CPU core. Correct?
|
|
-
-
derula


- Joined on 06-15-2007
- Posts 282
|
Re: Speaking of unstable databases....
I'm not an expert on multithreading, but I think the actual question is: Can you translate a non-parallel algorithm into a parallel algorithm, so that the parallel version runs faster on a dual-core with 1 GHz each than the non-parallel version runs on a single-core with 2 GHz? And in fact that's possible for some algorithms, isn't it?
# The Brillant Paula Bean
# Author:: derula
# Copyright:: Copyright (c) 2005-2008 Paula Bean
# License:: Originally distributed under a
#           proprietary license by Paula Bean.
#           Ruby port by derula published
#           under the same terms as Ruby's.

# This module provides a method to return the
# brillance of Paula Bean. This can be achieved
# by calling the module method get_paula.
module Paula_Bean
  # This constant stores the essence of Paula
  # Bean's brillance. This is a private
  # constant, so you MAY NOT use
  # Paula_Bean::PAULA to access it! Please use
  # the get_paula method instead.
  PAULA = "Brillant"

  # Returns Paula Bean's brillance.
  def self.get_paula
    return PAULA
  end
end
|
|
-
-
dlikhten


- Joined on 09-27-2007
- New York Citeyah
- Posts 670
|
Re: Speaking of unstable databases....
Ok, I know this will sound like trolling, but please for the love of god listen... TRWTF is that some people here seem to have forgotten what multi-threading is about. Multi-threading is necessary even on single-core CPUs... The theory behind multiple threads is "lightweight processes", so then the question is: Why multiprocess? Many on this thread believe it is because more cores = more parallel processing. The truth is yes, that helps, but the reason for multi-threading is more basic than that. If you have multiple independent threads then they can all act at the same time, thus more cores = faster processing. HOWEVER many programs simply won't get that benefit.
The goal of multi-threading is that if one thread is waiting due to I/O of any sort, another thread is running and doing all it needs to do while the first thread is waiting. In other words if I have a program that reads a file, counts to 20, then uses the result of both operations, it's more efficient to have 1 thread read the file, the other count to 20, and then once they are both done use the result. It will work faster because the counting will happen while the file is being read vs having to wait for completion.
A log reader that reads multiple log files is a great multi-threading candidate. You assign every log file to a separate reader thread. This way you get as much I/O queued up as possible and have an independent thread take all that I/O and process whatever it has.
Now having said that, a program made for parallel execution which does not need to constantly stop and wait for other threads to be done will work better if you give it more cores. In such a case an extra core can cut down execution time. But many times that is not the case for many programs. So yes, a 4-thread program on a 4-core CPU which does nothing but computation will run faster than using 1 core. You don't need to prove that to anyone.
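The "read a file while counting to 20" example can be sketched like this in Python. The slow read is simulated with a sleep, which is an assumption made to keep the sketch self-contained; the function names and the `results` dictionary are hypothetical.

```python
import threading
import time

def slow_read(results, delay=0.2):
    """Stand-in for a blocking file read (sleeps instead of doing real I/O)."""
    time.sleep(delay)              # this thread is parked; the CPU is free
    results["data"] = "file contents"

def count_to(results, n=20):
    """The CPU-side work from the example: count up to n."""
    count = 0
    while count < n:
        count += 1
    results["count"] = count

def run_overlapped():
    """Start the read, do the counting meanwhile, then wait for the read."""
    results = {}
    reader = threading.Thread(target=slow_read, args=(results,))
    reader.start()                 # the I/O wait begins here
    count_to(results)              # counting happens during that wait
    reader.join()                  # both results are available after this
    return results

if __name__ == "__main__":
    print(run_overlapped())
```

The gain here comes from hiding the wait, not from a second core, so it applies on a single-core machine too. Whether threads are the best tool for hiding that wait is what asuffield disputes earlier in the thread.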
|
|
-
-
MasterPlanSoftware


- Joined on 11-10-2006
- Posts 108
|
Re: Speaking of unstable databases....
dlikhten:... So yes, a 4-thread program on a 4 core cpu which does nothing but computation will run faster than using 1 core. You don't need to prove that to anyone.
I can't believe I am going to say this... but thank you. You have gone back to my original point and I agree.
|
|
-
-
MasterPlanSoftware


- Joined on 11-10-2006
- Posts 108
|
Re: Speaking of unstable databases....
dlikhten:... Edit: Why does he even need locks? If he is completely single-threaded there is no reason for locks/synchronization. Locks are for multi-processes. Synchronization is for multi-threading... So go explain that one away, my friends.
And just for reference... I wasn't arguing about how to multithread or anything else. The OP pointed out that in HIS CASE there was no demand to multithread. Therefore a lock was unnecessary. Somehow everyone else got rolled up in this idea of multithreading. I only argued when it was brought up that if he was to use more threads, he would gain performance. That simple statement is not true. It is more complex than that.
|
|
-
-
superjer


- Joined on 10-04-2007
- Seattle
- Posts 104
|
Re: Speaking of unstable databases....
MasterPlanSoftware: dlikhten:... So yes, a 4-thread program on a 4 core cpu which does nothing but computation will run faster than using 1 core. You don't need to prove that to anyone.
I can't believe I am going to say this... but thank you. You have gone back to my original point and I agree.
I've been arguing that multithreaded apps can use multiple cores to complete computations faster, because they're in parallel, of course. You, MasterPlan, have repeated over and over that you can't do parallel computing with multithreading, when in fact it is commonly done on all major, modern OSes. Are you trying to steal my argument now?
MasterPlanSoftware:Multithreading cannot make a CPU execute two instructions at the same time.
MasterPlanSoftware:Throwing an extra thread in your program does not make anything run parallel or concurrent.
|
|
-
-
MasterPlanSoftware


- Joined on 11-10-2006
- Posts 108
|
Re: Speaking of unstable databases....
superjer:Are you trying to steal my argument now?
MasterPlanSoftware:Multithreading cannot make a CPU execute two instructions at the same time.
Emphasis mine. Reading comprehension, my friend.
|
|
-
-
superjer


- Joined on 10-04-2007
- Seattle
- Posts 104
|
Re: Speaking of unstable databases....
Care to add emphasis here?
MasterPlanSoftware: superjer:... Multithreading when the threads run in parallel on multiple cores is parallel computing. ...
Sorry. It is NOT parallel computing.
|
|
-
-
MasterPlanSoftware


- Joined on 11-10-2006
- Posts 108
|
Re: Speaking of unstable databases....
superjer: Care to add emphasis here? MasterPlanSoftware: superjer:... Multithreading when the threads run in parallel on multiple cores is parallel computing. ...
Sorry. It is NOT parallel computing.
(Original emphasis included.) Nope. Throwing threads at a thread scheduler is not parallel computing. Just because there is a chance two threads might get a cpu time slice at the same time does not make this parallel computing.
|
|
-
-
superjer


- Joined on 10-04-2007
- Seattle
- Posts 104
|
Re: Speaking of unstable databases....
There is not just a chance. Given multiple cores, threads will run in parallel practically every time. My program confirms it. It works every time I run it on 2 linuxes, Mac OS X, XP & Vista. It is supposed to work that way.
Your only mistake is underestimating the design of modern operating systems.
|
|
-
-
MasterPlanSoftware


- Joined on 11-10-2006
- Posts 108
|
Re: Speaking of unstable databases....
superjer:There is not just a chance. Given multiple cores, threads will run in parallel practically every time. My program confirms it. It works every time I run it on 2 linuxes, Mac OS X, XP & Vista. It is supposed to work that way.
Your only mistake is underestimating the design of modern operating systems.
The thread scheduler in Windows will choose a processor to run a new thread on depending on its current load. If one of your processors is a lot more idle than the other, both threads will run on the same processor.
|
|
-
-
superjer


- Joined on 10-04-2007
- Seattle
- Posts 104
|
Re: Speaking of unstable databases....
I have tested that extensively (mostly on Linux though) and I have found that even if one core is 95% busy, the OS will still use that 5% for my second thread. Every time. Try it yourself. Then when the 1st thread completes (20x faster I suppose) the OS moves my second thread to the now-free core. The result? 10% of the job was completed in parallel. And that's the best it could do! Modern operating systems do an excellent job of using all available resources. For God's sake just try it and see.
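The scenario described above can be put into a rough back-of-the-envelope model. The sketch below assumes two equal jobs, one core completely free and one with only a small share to spare, and an instant migration of the second job once the first finishes. None of these assumptions are stated in the post, and the function name is made up.

```python
def two_thread_timeline(work=1.0, busy_fraction=0.95):
    """Model two equal jobs on one free core and one mostly-busy core.

    Assumptions (not from the post): each job needs `work` seconds on a
    free core, and the second job migrates the instant the first finishes.
    """
    spare = 1.0 - busy_fraction            # share of the busy core we get
    t_first_done = work                    # job 1 runs at full speed
    done_by_second = spare * t_first_done  # job 2 crawls along meanwhile
    t_total = t_first_done + (work - done_by_second)
    serial = 2 * work
    return {
        "total_time": t_total,
        "serial_time": serial,
        "speedup": serial / t_total,
        "overlapped_work": done_by_second,
    }

if __name__ == "__main__":
    for key, value in two_thread_timeline().items():
        print(f"{key}: {value:.4f}")
```

Under these assumptions the second job gets about 5% of its work done during the overlap, so the whole run takes about 1.95 units instead of 2. How that maps onto the "10%" figure in the post depends on what exactly is being counted.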
|
|
-
-
MasterPlanSoftware


- Joined on 11-10-2006
- Posts 108
|
Re: Speaking of unstable databases....
superjer:I have tested that extensively (mostly on Linux though) and I have found that even if one core is 95% busy, the OS will still use that 5% for my second thread. Every time. Try it yourself. Then when the 1st thread completes (20x faster I suppose) the OS moves my second thread to the now-free core. The result? 10% of the job was completed in parallel. And that's the best it could do! Modern operating systems do an excellent job of using all available resources. For God's sake just try it and see.
That is a form of load balancing. If it was truly executing the two threads in parallel, and the threads were doing the same amount of work, they would complete at the same time. Every time.
|
|
-
-
superjer


- Joined on 10-04-2007
- Seattle
- Posts 104
|
Re: Speaking of unstable databases....
If they aren't running in parallel, then how do you propose the computation completes approximately four times faster with four threads than it does with one thread? Keep in mind that the one thread uses "100%" of the CPU and the four use "399%" of the CPU according to top.
|
|
-
-
MasterPlanSoftware


- Joined on 11-10-2006
- Posts 108
|
Re: Speaking of unstable databases....
superjer:If they aren't running in parallel, then how do you propose the computation completes approximately four times faster with four threads than it does with one thread? Keep in mind that the one thread uses "100%" of the CPU and the four use "399%" of the CPU according to top.
I cannot argue about your specific code or results. I am simply stating how these things work. If you achieve different results, it sounds like YOU should investigate. I am sure we would all love to hear the concrete results. Most of your assertions so far have added up to "Every thread you add increases performance." Unfortunately you are just flat wrong. I have no other answer for you.
|
|
-
-
superjer


- Joined on 10-04-2007
- Seattle
- Posts 104
|
Re: Speaking of unstable databases....
"Every thread you add increases performance." is a ridiculous statement and I did not make it. As I have been saying all along, every thread you can run in parallel on multiple cores increases performance. And OSes allow you to do so easily with multithreading. My program is NOT a special case. It is the norm. Talk to the people who make Photoshop filters, games like Crysis and Supreme Commander, ray-tracers and BSP compilers.
|
|