
The reading list for scientific programmer [closed]

Developer https://www.devze.com 2022-12-10 16:20 Source: Internet
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance. Closed 10 years ago.

I am working to become a scientific programmer. I have enough background in math and statistics but lack a programming background. I have found it very hard to learn how to use a language for scientific programming because most of the references for scientific programming (SP) are close to trivial.

My work involves statistical/financial modelling and no physics modelling. Currently, I use Python extensively with numpy and scipy. I have also used R and Mathematica. I know enough C/C++ to read code. I have no experience in Fortran.

I don't know if this is a good list of languages for a scientific programmer. If it is, what is a good reading list for learning the syntax and design patterns of these languages in scientific settings?


At some stage you're going to need floating point arithmetic. It's hard to do it well, less hard to do it competently, and easy to do it badly. This paper is a must read:

What Every Computer Scientist Should Know About Floating-Point Arithmetic
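As a small illustration of the kind of pitfall that paper covers, here is a minimal Python sketch (Python because that is what the question uses). The variable names are made up for this example; `math.isclose` and `math.fsum` are standard-library functions. It assumes the usual IEEE 754 double-precision floats.

```python
import math

# Decimal fractions such as 0.1 have no exact binary representation,
# so a naive equality test on the result of arithmetic can fail.
naive_equal = (0.1 + 0.2 == 0.3)
tolerant_equal = math.isclose(0.1 + 0.2, 0.3)

# Rounding error accumulates when adding step by step in a plain loop.
naive_total = 0.0
for _ in range(10):
    naive_total += 0.1

# math.fsum tracks the lost low-order bits and returns the correctly
# rounded sum of the same ten values.
accurate_total = math.fsum([0.1] * 10)

# Absorption: near 1e16 adjacent doubles are 2 apart, so adding 1.0
# is rounded away before the subtraction happens.
absorbed = (1e16 + 1.0) - 1e16

print(naive_equal, tolerant_equal)
print(naive_total, accurate_total)
print(absorbed)
```

The practical habits this suggests are to compare floats with a tolerance, to prefer a compensated sum when adding many terms, and to be wary of mixing values of very different magnitude.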


I thoroughly recommend

Scientific and Engineering C++: An Introduction with Advanced Techniques and Examples by Barton and Nackman

Don't be put off by its age; it's excellent. Numerical Recipes in your favourite language (so long as it is C, C++ or Fortran) is compendious and excellent for learning from, though it does not always present the best algorithm for each problem.

I also like

Parallel Scientific Computing in C++ and MPI: A Seamless Approach to Parallel Algorithms and their Implementation by Karniadakis

The sooner you start parallel computing the better.


My first suggestion is that you look at the top 5 universities for your specific field, look at what they're teaching and what the professors are using for research. That's how you can discover the relevant language/approach.

Also have a look at this Stack Overflow question ("practices-for-programming-in-a-scientific-environment").

You're doing statistical/finance modeling? I use R in that field myself, and it is quickly becoming the standard for statistical analysis, especially in the social sciences, but in finance as well (see, for instance, http://rinfinance.com). MATLAB is probably still more widely used in industry, but I have the sense that this may be changing. I would only fall back to C++ as a last resort if performance is a major factor.

Look at these related questions for help finding reading materials related to R:

  • suitable-functional-language-for-scientific-statistical-computing
  • books-for-learning-the-r-language
  • what-can-be-done-in-r-that-cant-be-done-with-python-numpy-scipy
  • r-for-finance-tutorials-resources

In terms of book recommendations related to statistics and finance, I still think that the best general option is David Ruppert's "Statistics and Finance" (you can find most of the R code here and the author's website has MATLAB code).

Lastly, if your scientific computing isn't statistical, then I actually think that Mathematica is the best tool. It seems to get very little mention amongst programmers, but it is the best tool for pure scientific research in my view. It has much better support for things like integration and partial differential equations than MATLAB. They have a nice list of books on the Wolfram website.


In terms of languages, I think you have a good coverage. Python is great for experimentation and prototyping, Mathematica is good for helping with the theoretical stuff, and C/C++ are there if you need to do serious number crunching.

I might also suggest you develop an appreciation of an assembly language and also a functional language (such as Haskell), not really to use, but rather because of the effect they have on your programming skills and style, and of the concepts they bring home to you. They might also come in handy one day.

I would also consider it vital to learn about parallel programming (concurrent/distributed) as this is the only way to access the sort of computing power that sometimes is necessary for scientific problems. Exposure to functional programming would be quite helpful in this regard, whether or not you actually use a functional language to solve the problem.
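To make the link between functional style and parallelism concrete, here is a minimal Python sketch. The function names (`estimate_pi`, `run_all`) and the Monte Carlo task are invented for illustration. The point is that a function with no shared state can be handed to any map-like executor without changing it. A thread pool is used only because it runs anywhere; for CPU-bound pure-Python work a process pool or MPI would be the more typical choice, and this is not meant as a performance recipe.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def estimate_pi(seed, n_samples=20_000):
    """Monte Carlo estimate of pi; depends only on its arguments."""
    rng = random.Random(seed)  # private generator, no shared state
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples


def run_all(seeds, mapper=map):
    """Apply estimate_pi to every seed using any map-like callable."""
    return list(mapper(estimate_pi, seeds))


seeds = range(8)

# Serial run with the built-in map.
serial = run_all(seeds)

# Same computation, with the work distributed by an executor.
with ThreadPoolExecutor(max_workers=4) as pool:
    pooled = run_all(seeds, mapper=pool.map)

print(serial == pooled)
print(sum(pooled) / len(pooled))
```

Because each task owns its seed and touches nothing global, the serial and pooled runs give identical results, which is what makes the switch to a parallel executor safe.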

Unfortunately I don't have much to suggest in the way of reading, but you may find The Scientist and Engineer's Guide to Digital Signal Processing helpful.


I'm a scientific programmer who just entered the field in the past 2 years. I'm into more biology and physics modeling, but I bet what you're looking for is pretty similar. While I was applying to jobs and internships there were two things that I didn't think would be that important to know, but caused me to end up missing out on opportunities. One was MATLAB, which has already been mentioned. The other was database design -- no matter what area of SP you're in, there's probably going to be a lot of data that has to be managed somehow.

The book Database Design for Mere Mortals by Michael Hernandez was recommended to me as being a good start and helped me out a lot in my preparation. I would also make sure you at least understand some basic SQL if you don't already.
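For a sense of what "basic SQL" means in practice, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The table names, column names, tickers and prices are all invented for this example and are not taken from the book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two related tables: each price row points at one instrument.
conn.execute(
    "CREATE TABLE instrument ("
    " id INTEGER PRIMARY KEY,"
    " ticker TEXT NOT NULL UNIQUE)"
)
conn.execute(
    "CREATE TABLE daily_price ("
    " instrument_id INTEGER NOT NULL REFERENCES instrument(id),"
    " trade_date TEXT NOT NULL,"
    " close_price REAL NOT NULL,"
    " PRIMARY KEY (instrument_id, trade_date))"
)

# Parameterised inserts keep data separate from the SQL text.
conn.executemany(
    "INSERT INTO instrument (id, ticker) VALUES (?, ?)",
    [(1, "AAA"), (2, "BBB")],
)
conn.executemany(
    "INSERT INTO daily_price VALUES (?, ?, ?)",
    [
        (1, "2020-01-02", 10.0),
        (1, "2020-01-03", 12.0),
        (2, "2020-01-02", 20.0),
        (2, "2020-01-03", 19.0),
    ],
)

# Join the tables and aggregate per instrument.
rows = conn.execute(
    "SELECT i.ticker, AVG(p.close_price), COUNT(*)"
    " FROM daily_price AS p"
    " JOIN instrument AS i ON i.id = p.instrument_id"
    " GROUP BY i.ticker"
    " ORDER BY i.ticker"
).fetchall()

for ticker, mean_close, n_days in rows:
    print(ticker, mean_close, n_days)

conn.close()
```

Storing the ticker once and referring to it by key from the price table is the sort of normalisation decision that database design books spend most of their time on.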


I would suggest any of the Numerical Recipes books (pick a language); they are all useful.

Depending on the languages you use or if you will be doing visualization there can be other suggestions.

Another book I really like is Object-Oriented Implementation of Numerical Methods, by Didier Besset. He shows how to implement many equations in Java and Smalltalk, but more importantly he does a fantastic job of showing how to optimize equations for use on a computer and how to deal with the errors that arise from the limitations of the computer.


Donald Knuth's book on seminumerical algorithms.


MATLAB is widely used in engineering for design, rapid development, and even production applications (my current project has a MATLAB-generated DLL for doing some advanced number crunching that was easier to do than in our native C++, and our FPGAs use MATLAB-generated cores for signal processing too, which is much easier than coding the same by hand in VHDL). There's also a financial toolbox for MATLAB that may be of interest to you.

This is not to say that MATLAB is the best choice for your field, but at least in engineering, it's widely used and not going anywhere soon.


One issue scientific programmers face is maintaining a repository of code (and data) that others can use to reproduce your experiments. In my experience this is a skill not required in commercial development.

Here are some readings on this:

  • A pipeline is a makefile
  • A Quick Guide to Organizing Computational Biology Projects

These are in the context of computational biology, but I assume they apply to most scientific programming.

Also, look at Python Scripting for Computational Science.


Ok here's my list of books that I've been using for the very same purpose:

Numerical Methods for Scientists and Engineers

Numerical Recipes 3rd Edition: The Art of Scientific Computing

CUDA by Example: An Introduction to General-Purpose GPU Programming

Using OpenMP: Portable Shared Memory Parallel Programming (Scientific and Engineering Computation)

Parallel Programming in C with MPI and OpenMP

Donald Knuth: Seminumerical Algorithms, Volume 2 of The Art of Computer Programming

Also I found myself using R rather than Python lately.


For generic C++ in scientific environments, Modern C++ Design by Andrei Alexandrescu is probably the standard book about the common design patterns.


Once you are up and running, I would strongly recommend reading this blog.

It describes how you use C++ templates to provide type-safe units. So, for example, if you multiply velocity by time you get a distance, etc.
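The blog's approach relies on C++ templates, which check units at compile time. The sketch below is not that technique. It is a rough runtime analogue in Python, with a hypothetical `Quantity` class and helper functions invented here, meant only to show the underlying idea of carrying dimension exponents alongside a value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A value tagged with exponents of (length, time). Hypothetical."""
    value: float
    dims: tuple

    def __mul__(self, other):
        # Multiplying quantities adds their dimension exponents.
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value * other.value, dims)

    def __truediv__(self, other):
        # Dividing quantities subtracts their dimension exponents.
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value / other.value, dims)

    def __add__(self, other):
        # Addition only makes sense for matching dimensions.
        if self.dims != other.dims:
            raise TypeError("cannot add quantities with different dimensions")
        return Quantity(self.value + other.value, self.dims)


def metres(value):
    return Quantity(value, (1, 0))


def seconds(value):
    return Quantity(value, (0, 1))


velocity = metres(30.0) / seconds(2.0)   # dims (1, -1): length per time
distance = velocity * seconds(4.0)       # dims (1, 0): a length again

print(distance)

try:
    metres(1.0) + seconds(1.0)
except TypeError as exc:
    print("rejected:", exc)
```

The C++ template version catches the same mistake before the program ever runs and has no runtime cost, which is the main attraction of doing it in the type system.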


Reading source code helps a lot, too. Python is great in this sense. I have learnt a great deal just by digging through the source code of scientific Python tools. On top of this, following your favourite tools' mailing lists and forums can enhance your skills further.


This might be useful: The Nature of Mathematical Modeling


Donald Knuth: Seminumerical Algorithms, Volume 2 of The Art of Computer Programming

Press, Teukolsky, Vetterling, Flannery: Numerical Recipes in C++ (the book is great, just beware of the license)

Modern C++ Design

Also have a gander at the source code for the GNU Scientific Library.


Writing Scientific Software: A Guide to Good Style is a good book with overall advice for modern scientific programming.


For Java, I recommend a look at Unit-API.
Implementations are Eclipse UOMo (http://www.eclipse.org/uomo) and JScience.org (work in progress for Unit-API; earlier implementations of JSR-275 exist).
