We consider the numerical solution of multiscale problems for PDEs. Typically, a direct discretization on a grid that resolves all scales is too expensive, while a coarse grid yields a discrete approximation that is too inaccurate. In analogy with classical homogenization theory, we derive an effective (discrete) coarse-grid operator whose structure is similar to that given by direct discretization, but with a locally altered stencil that accounts for the effect of subgrid scales. We present a general procedure for doing this, based on wavelet projections of the discrete fine-grid operator followed by sparse approximation. We discuss some theoretical underpinnings of the method and show results from various numerical experiments.
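
To make the two-step procedure concrete, the following is a minimal one-dimensional sketch under stated assumptions, not the implementation used in this work. It assumes a Haar wavelet basis, a model problem -(a u')' = f with an oscillatory coefficient, a Schur complement as the way the fine-scale components are eliminated in the projection step, and simple band truncation as the sparse approximation. All function names (`haar_split`, `fine_operator`, `coarsen`, `band_truncate`) and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def haar_split(n):
    """Return (P, Q): orthonormal Haar averaging and detail maps R^n -> R^(n/2)."""
    assert n % 2 == 0, "n must be even for one Haar level"
    half = n // 2
    P = np.zeros((half, n))
    Q = np.zeros((half, n))
    s = 1.0 / np.sqrt(2.0)
    for k in range(half):
        P[k, 2 * k] = s
        P[k, 2 * k + 1] = s   # local average (coarse scale)
        Q[k, 2 * k] = s
        Q[k, 2 * k + 1] = -s  # local difference (fine scale)
    return P, Q


def fine_operator(n, eps=1.0 / 16.0):
    """Finite-difference matrix for -(a u')' on (0, 1) with Dirichlet conditions.

    The coefficient a(x) = 1 / (2 + sin(2 pi x / eps)) is an assumed test case.
    """
    h = 1.0 / (n + 1)
    x_half = (np.arange(n + 1) + 0.5) * h          # cell midpoints
    a = 1.0 / (2.0 + np.sin(2.0 * np.pi * x_half / eps))
    main = a[:-1] + a[1:]                          # diagonal entries
    off = -a[1:-1]                                 # sub- and superdiagonal entries
    return (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2


def coarsen(L):
    """One level of coarsening: project onto the Haar basis, eliminate details."""
    P, Q = haar_split(L.shape[0])
    A = Q @ L @ Q.T   # fine-to-fine block
    B = Q @ L @ P.T   # coarse-to-fine coupling
    C = P @ L @ Q.T   # fine-to-coarse coupling
    D = P @ L @ P.T   # coarse-to-coarse block
    # Schur complement: coarse operator with the detail unknowns eliminated.
    return D - C @ np.linalg.solve(A, B)


def band_truncate(M, bandwidth):
    """Sparse approximation: zero every entry with |i - j| > bandwidth."""
    i, j = np.indices(M.shape)
    return np.where(np.abs(i - j) <= bandwidth, M, 0.0)


n = 64
L_fine = fine_operator(n)
L_coarse = coarsen(L_fine)                # dense in general
L_sparse = band_truncate(L_coarse, 3)     # banded, stencil-like approximation
```

In this sketch the untruncated coarse operator reproduces the Haar averages of the fine-grid solution exactly whenever the right-hand side has no detail component; how much accuracy survives the truncation depends on the bandwidth and on the problem, which the sketch does not attempt to quantify.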